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Abstract 

The  main  achievement  is  the  development  of  a  computational  system  which  allow  us  to  differentiate  between 
artificially  generated  textures,  as  well  as  natural  ones  with  the  detection  of  objects  within  the  image.  The 
main  tool  is  the  use  of  mathematical  methods  to  represent  an  image  in  terms  of  texture  and  the  cartoon  area  in 
2D  examples.  We  improve  the  classical  methods  by  detecting  objects  that  have  the  same  average  of  intensity, 
or  may  have  the  same  texture  but  slightly  shifted,  since  the  classical  discrimination  lies  on  the  comparison 
of  the  statistical  distribution  of  the  local  image  intensities  while  for  the  proposed  method  the  comparison  is 
between  the  patches  of  the  surrounding  area.  As  a  consequence,  we  automatically  detect  the  periodicity  of 
the  texture  and  obtain  a  representation  of  the  texture  elements,  as  a  dictionary  of  gabor  functions.  So  far, 
the  latter  has  been  implemented  in  ID  and  we  are  extending  it  to  2D.  This  issue  is  important  since  a  library 
of  textures  may  be  obtained  from  a  database  of  images.  Therefore,  it  is  possible  to  generate  a  classification 
and  a  common  basic  texture  for  each  particular  scenario. 
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1. 


Introduction 


1.1  Background  of  the  project 

Camouflage  plays  an  important  role  in  order  to  increase  the  effectiveness  of  a  mission  and  the  survivability 
of  soldiers.  Camouflage  can  be  defined  as  a  method  for  concealing  personnel  or  equipment  from  an  enemy  by 
making  them  appear  to  be  part  of  the  natural  surroundings.  Therefore,  the  purpose  of  a  given  camouflage 
pattern  is  to  break  the  contour  of  a  target  in  order  to  blend  this  target  with  the  background  where  is  placed. 
To  this  extent,  the  measurement  of  the  performance  of  a  camouflage  pattern  gives  a  quantitative  base  at  the 
moment  comparison  between  different  patterns  in  specific  scenarios. 

The  high  cost  that  entails  choosing  and  implementing  a  new  camouflage  pattern  for  armed  forces,  implies 
that  the  decision  should  be  based  on  carefully  designed  patterns  to  ensure  effectiveness  in  the  largest  set  of 
possible  scenarios,  considering  different  lightness  conditions,  land  surfaces,  weather  conditions,  among  other 
variables.  Hence,  the  development  of  an  automatic  assessment  system  for  camouflage  pattern  is  a  very  useful 
and  cost-saving  tool,  which  complements  human- visual  techniques  based  on  empirical  testing. 


Traditionally,  the  evaluation  of  camouflages  is  based  on  direct  observation  of  an  imagery  collection  by 
a  group  of  observers,  considering  different  conditions  of  lighting,  background  variation,  noise  and  distance, 
among  others.  And  recently,  many  methods  have  been  proposed  to  assess  camouflage  pattern  based  on  cluster 
analysis  or  the  use  of  descriptor  to  quantify  the  similarity  between  the  background  and  the  camouflage  texture 
(structural  similarity) . 

In  the  present  work,  we  propose  a  scheme  using  of  mathematical  modeling,  image  processing  and  human 
vision  models  to  estimate  the  effectiveness  of  the  camouflage  patterns  or  the  probability  to  detect  the  soldiers, 
under  different  parameters  of  design,  background  or  illumination  conditions.  In  particular  we  expect  to 
develop  a  computational  tool  for  assessing  camouflage  patterns  to  be  used  by  the  US  Armed  Forces. 

Since  this  is  a  specific  model  for  the  assessment  of  camouflage,  it  is  possible  to  process  large  quantities  of 
images  at  high  resolution  and  in  the  most  diverse  situations  obtaining  results  similar  to  those  that  might  result 
from  an  equivalent  process  with  human  evaluators.  Contrasted  with  traditional  human  visual  camouflage 
evaluations,  where  the  personnel  spend  several  days  in  the  field,  control  over  environmental  variables  turns  in 
a  severe  drawback,  including  basic  and  unpredictable  lightning  conditions,  weather,  background  variations, 
and  many  others.  Setting  new  standards  for  computational  image  analysis,  combining  mathematical  tools 
and  several  methodologies,  could  significantly  reduce  random  or  uncontrolled  variables  to  a  minimum  by 
artificially  exposing  relevant  information  with  the  consequent  improvement  in  results. 

The  aim  of  this  project  is  to  develop  and  evaluate  indexes  for  the  efectiveness  of  the  camouflage  of  soldiers. 
This  work  may  be  considered  as  the  starting  point  for  a  new  type  of  camouflage  assessment  tool,  reducing  time, 
cost  and  almost  every  aspect  on  current  standard  methodologies  based  on  human  visual  inspection  at  a  final 
long-term  integration.  With  the  use  of  this  evaluation  tool,  in  the  next  steps,  we  plan  to  develop  friendly 
software  for  the  evaluation  process  of  camouflages.  Moreover,  we  will  propose  to  create  a  computational 
camouflage  design  modulus.  This  computational  tool  will  propose  one  (maybe  several)  patterns  for  different 
combat  conditions. 
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1.2  Objectives  and  Goals 

As  stated  in  the  proposal,  the  main  objective  of  our  research  is  to  develop  a  computational  tool,  based 
on  mathematical  models,  computational  algorithms  and  human  vision  models,  to  allow  us  the  assessment  of 
camouflage  patterns,  considering  different  environmental  variables. 

From  the  main  objective  we  can  consider  the  following  specific  goals: 

•  Develop  new  image  processing  techniques  for  the  representation  lower  stages  of  human  vision. 

•  Develop  and  implement  mathematical  algorithms,  based  on  variational  models  of  image  processing,  for 
identifying  images  edges  and  textures. 

•  Build  indexes  for  assessing  the  detectability  of  camouflage  patterns. 

•  To  implement  the  developed  algorithms  on  a  computational  platform  for  assessing  the  effectiveness  of 
camouflage  patterns. 

•  In  the  next  steps,  to  develop  a  desing  modulus  for  new  patterns  and  textures. 


1.3  Summary 

In  the  first  report,  we  delivered  a  bibliographic  survey  on  texture  segmentation  for  using  in  camouflage 
assessment,  considering  a  variational  approach  to  this  problem.  In  particular,  we  made  a  review  of  papers 
related  to  the  following  subjects: 

•  Stochastic- Variational  Model  for  Soft  Mumford-Shah  Segmentation 

•  Nonlocal  Image  and  Movie  Denoising 

•  Nonlocal  Linear  Image  Regularization  and  Supervised  Segmentation 

•  Generalized  Nonlocal  Image  Smoothing 

•  Nonlocal  Mumford-Shah  Regularizers  for  Color  Image  Restoration 

•  Index  Camouflage  Measurements. 

We  note  that  in  the  analysis  of  a  camouflage,  one  of  the  main  elements  is  the  so-called  texture.  The  texture 
can  be  seen  as  a  repetition  of  basic  texture  elements  called  texels  or  textons,  where  the  texels  correspond 
to  the  fundamental  unit  of  texture  space  whose  placement  obey  some  rule.  In  general,  camouflage  can  be 
defined  as  a  method  for  concealing  animals,  military  vehicles,  soldier  or  other  objects  to  remain  unnoticed  by 
blending  with  their  environment.  A  camouflage  pattern  will  be  successful  if  the  visual  system  of  the  observer 
(or  enemy)  is  unable  to  discriminate  (or  segment)  the  two  textures  of  the  background  (environment)  and  the 
target.  Then,  an  objective  method  to  assess  the  effectiveness  of  camouflage  pattern  is  the  texture  pattern 
analysis,  in  particular,  the  texture  segmentation. 

The  objective  of  the  algorithm  is  to  recover  both  image  structures  by  using  our  nonlocal  approach. 

As  showed  in  the  previous  figure,  the  combination  of  the  non-local  means  and  the  non-local  Mumford-Shah 
regularization  may  improve  the  performance  of  finding  the  contours  between  regions  with  different  textures. 

For  more  details,  we  refers  to  the  following  sections  of  the  report. 

The  two-dimensional  case... 

Index  for  the  evaluation  of  the  camouflage  patterns... 
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2.  The  texture  segmentation  problem 


As  the  final  objective  of  the  project  is  to  calculate  an  index  of  the  performance  of  camouflage,  a  powerful 
segmentation  of  texture  algorithm  is  necessary.  For  a  good  segmentation  algorithm,  texture  analysis  is 
important  in  order  to  better  achieve  a  relieable  processing.  Techniques  having  texture  analysis  may  be 
divided  into  five  groups  [11]:  structural,  statistical,  signal  processing,  model-based  stochastic,  morphology 
based,  artificial  intelligence  and  variational  PDEs: 

•  Structural:  where  a  given  texture  is  generated  by  a  set  of  texture  primitives  related  by  a  set  of  rules 

•  Statistical:  uses  numerical  indexes  such  as  contrast,  energy,  entropy  and  relationship  between  pairs  of 
pixels  in  a  certain  spatial  orientation  with  given  gray-levels. 

•  Signal  Processing:  usually  is  related  with  the  use  of  the  Fourier  Transform  where  characteristics  such 
as  periodicity  and  directionality  are  considered.  Many  times,  the  use  of  filters  is  also  included. 

•  Stochastic:  where  2D  ARM  A  models  or  Markov  Random  fields  are  usually  applied. 

•  Morphology:  through  the  morphological  operators  with  structuring  elements  having  different  shapes 
are  applied  in  order  to  discriminate  between  various  textures. 

•  Artificial  Intelligence:  the  use  of  an  Artificial  Neural  Network  trained  with  back-propagation  has  been 
used  as  classifiers. 

•  Variational  PDEs  (Partial  Differential  Equations):  the  minimization  of  a  functional  is  used  to  obtain  a 
non-linear  PDE.  The  solution  involves  finding  the  contours  of  the  different  objects  within  the  image. 

For  the  present  work,  we  will  be  focused  on  a  combination  of  Variational  PDE  methods  together  with  some 
statistical  properties  of  the  textures. 

Other  approches  such  as  neural-networks  will  not  be  considered  due  to  the  fact  that  the  training  for 
detection  have  to  be  on  a  specific  set  of  textures.  On  the  other  hand,  it  is  important  to  recall  that  the  final 
goal  is  to  obtain  a  performance  index,  therefore  black-box  approaches  are  not  that  well  suited  for  these  kind 
of  tasks. 

2.1  Variational  PDE  models 

As  previously  mentioned,  the  variational  PDE  models  consists  of  defining  a  cost  function  (the  functional), 

where  the  unknown  variables  are  the  pixels  of  an  image  or  set  of  pixels  of  more  than  one  function  (e.g.  image 

and  contour).  Because  this  problem  is  solved  by  obtaining  the  minimum  and  the  number  of  variables  is  of 
the  order  of  the  size  of  the  image,  the  use  of  efficicient  optimization  algorithms  is  necessary.  Discretization 
issues  should  also  be  considered  in  order  to  obtain  results  of  better  quality.  A  general  introduction  is  found 
in  [6]. 

An  example  of  these  type  of  algorithms  is  the  calculation  of  the  Total  Variation  of  a  given  function 

min  TV  [u,  K\ uq]  =  min  <  /  (u  —  Uo)2dfl  +  a  /  |Vit|df2>  (2.1) 

“  K’u  [Jn  Jn  J 

as  is  exposed  in  [5].  The  first  term  corresponds  to  the  similarity  term  and  is  the  one  in  charge  of  keeping  the 
output  image  close  to  the  input  one.  The  second  term  is  the  smoothing  (or  regularization)  term,  in  charge 
of  the  smoothness  of  the  output  image.  The  variable  to  minimize  is  the  whole  output  image. 

More  mathematical  details  are  to  be  foun  in  Appendix  B  and  the  summaries  of  the  some  state-of-the-art 
work  involving  non-local  segmentation  considering  the  variational  approach  (as  in  Report  Dec  2011)  are  also 
included  in  the  appendix. 
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2.2  The  non-local  means 


The  key  idea  here  is  that  this  formulation  estimates  the  value  of  x  as  an  average  of  the  values  of  all 
pixels  whose  Gaussian  neighborhood  looks  like  the  neighborhood  of  x,  this  is  important  because  with  this 
formulation  we  can  take  advantage  of  the  high  degree  of  redundancy  of  natural  images,  or,  in  other  words,  we 
can  detect  textures  and  preserve  it,  instead  of  consider  them  as  noise,  like  the  usual  algorithms  for  denoising 
does.  For  more  detailed  information  please  refer  to  section  B.2  at  the  appendix. 


2.3  Non-local  variational  image  models 


We  will  focus  on  the  variational  formulation  of  the  segmentation  problems  where  the  (smoothness)  regu¬ 
larization  measurements  consider  the  neighbourhood  of  the  pixel  [10].  The  energy  of  the  functionals  define 
the  forces  of  the  active  contours,  but  the  non-local  versions  of  some  classical  active  contour  approaches  use 
the  notion  of  patch  distance  instead  of  usual  point-wise  difference  of  luminosity. 

Because  the  main  modification  is  in  the  gradient,  we  focus  on  the  classical  definition  of  the  derivative  of 
the  function  /  :  [a,  b]  — >  (—00,00),  in  a  point  ato  £  [a,  b\,  giving  local  information  about  the  behavior  of  the 
function.  Actually,  \f'(xo)\  is  a  measure  of  the  function’s  variation  in  the  point  Xq.  Then  is  natural  to  think 


that 


f  /O)  -  fjy) 
V  \x-y\ 


2 

$x(y)dy 


(f(x)  -  f(y))2 


(  $*(y)  \ 
\\x-v\V 


dy, 


(2.2) 


where  5X  is  the  Dirac  function  concentrated  at  x,  the  idea  of  the  non-local  is  consider  a  more  general  expression 
for  (2.2).  Therefore,  we  replace  5x(y)  — »  w(x,y).  If  we  choose  a  specific  location  x,  the  repetition  of  the 
texture  will  be  at  the  y  positions  where  w(x,  y)  has  higher  values. 
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3.  The  Index  Measurement 


One  of  the  main  issues  of  this  research  project,  is  the  elaboration  of  an  index  for  measuring,  in  some 
sense,  wheter  or  not  an  object  in  a  natural  image,  can  be  easily  recogniced.  Our  aim  is  to  develop  an  index 
associated  to  a  given  image  under  study,  such  that  high  values  of  the  index  means  that  the  object(s)  in  the 
image  can  be  easily  recogniced  by  a  human  observer.  We  propose  a  definition  and  computation  of  an  index 
based  on  the  non-local  Mumford-Shah  framework.  In  this  context,  the  two  main  factors  that  determines 
the  capability  of  human  vision  to  recognize  objects  in  natural  images,  are  the  change  in  intensity  level  and 
change  in  texture  of  the  object  respect  to  the  background.  We  define  the  Contour-Index  as  the  change  of 
texture  and  intensity  level  along  the  object’s  contour  and  we  compute  it  based  on  the  results  from  the  non¬ 
local  Mumford-Shah  algorithm.  We  describe  below  the  main  components  of  the  process  and  in  the  appendix 
section,  we  will  provide  the  mathematical  definitions  and  discuss  implementation  details. 

3.1  Previous  Steps 

3.1.1  Implementation  of  the  non-local  Mumford-Shah  regularization 

Given  an  initial  image  u o,  we  consider  the  variational  regularization  problem  consisting  in  finding  a 
regularized  image  u*  and  its  associated  soft-characteristic  border  function  v*  :  12  — >  [0,1]  such  that  ( u*,v *) 
minimizes  the  Mumford-Shah  (MS)  cost  functional 

Fmsnl(u,v)  =  Fsirn(u)  +  aFRegL2NL(u,v)  +  ^Fat{v)  (MS),  (3.1) 

where  Fsim(u)  =  fn(u  —  uo)2dx  is  the  similarity,  FRegR2NL(u ,  v )  =  fQ  v2(x)\\7  wu\2(x)dx  is  the  MS  regulariza¬ 
tion  term,  Fat{v)  =  Jfi(e|Vv|2  +  ^jp~)dx  is  the  Ambrosia- Tortorelli  term  and  a, 7  are  positive  parameters 
for  controlling  the  relative  importance  among  the  different  terms. 

We  also  consider  an  alternative  of  the  previous  problem,  based  on  the  Total  Variation  (TV)  cost  functional 

Ftvnl{u,v)  =  Fsirn(u )  +  aFRegLiNL(u,v)  +  7 Fat{v )  (TV),  (3.2) 

where  FRegL iNL  =  fQ  v2(x)\Vwu\(x)dx  is  the  TV  regularization  term. 

3.1.2  Minimization  of  MS  cost  functional  FMsnl 

In  this  section  we  will  provide  numerical  implementation  details  for  the  minimization  of  Fmsnl  defined 
in  (3.1).  Observe  that  Fmsnl  is  differentiable  and  Fmsnl{av)i  Fmsnl{u,  ■)  are  convex,  therefore  a  gradient 
based  alternating  minimization  scheme  is  well  suited  for  minimizing  Fmsnl ■  Our  algorithm  takes  as  input 
an  initial  image  uq  and  an  initial  characteristic  function  vo  (for  instance,  vq  =  1).  In  a  first  step,  the  weight 
function  w  is  computed,  based  on  the  initial  image  uq.  Then,  a  gradient  descent  alternating  minimization 
scheme  is  carried  out,  considering  K  outer  iterations  and  J  inner  iterations.  For  each  (outer)  iteration 
k  =  1,..., K,  we  compute  Uk  by  gradient  descent  minimization  of  F{-,Vk~i)  with  step  <5  and  J  (inner) 
iterations.  Then  we  compute  Vk  by  gradient  descent  minimization  of  F(uk,-)  with  the  same  step  <5  and  the 
same  number  of  inner  iterations  J.  The  algorithm  is  summarized  in  Algorithm  3.1. 

3.1.3  Gradient  of  Fmsnl 

When  computing  the  Gateaux  derivatives  of  the  different  terms  that  conforms  Fmsnl  with  respect  to 
smooth  functions  g,  h  :  12  — >  ffi.  with  compact  support,  the  Gateaux  derivative  of  Fsim  with  direction  g  is 
given  by 

DFsim(u)(g)  =  2  /  (u(x)  -  u0{x))g(x)dx. 

Jn 

Also,  DFat{v){H)  =  2e  fQ  Vv(x)Vh(x)dx  +  £  Jq(v(  x)  —  1  )dx.  Now,  recall  the  gradient-divergence  adjoint 
relation  fQ  f(x)  ■  X7h(x)dx  =  —  fn  div  f(x)h{ x)  where  /  :  12  — >■  R9.  Hence,  taking  /  =  Vv,  we  obtain 

DFat(v)(F)  =  —  2e  f  Av(x)h(x)dx  +  —  f  (v{x)  —  l)dx. 

Jn  2e  Jn 
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Algorithm  3.1  Alternating-Gradient  NLMS 
Require:  uq  initial  image  and  vq  initial  characteristic  function. 
Ensure:  u  regularized  image  and  v  final  characteristic  function. 
Compute  sparse  weights  w  using  initial  image  u o 

for  k  =  1 , . . . ,  K  do 

Uk  :=  Uk-i  ,  Vk  :=  Vk-i 

for  j  =  1, . . . ,  J  do 

gu  :=  DuFMSNL(uk,vk ),  according  to  (3.9) 

Uk  ‘-=  Uk  &Qu 

end  for 

for  j  =  1, . . . ,  J  do 

9v  ■=  DvFMSNL(uk,Vk ),  according  to  (3.10) 

Vk  -=  Vk  Sgv 

end  for 
end  for 

u  :=  uk  ,  v  :=  vk 


On  the  other  hand,  the  partial  derivative  of  FRegL2NL  respect  to  u  with  direction  g  is  given  by  DuFRegL2NL{u,  v)(g) 
2  fn  fn  v2(x)Vwu(x,  y)Vwg(x,  y)dydx.  By  the  non-local  version  of  the  gradient-divergence  adjoint  relation 
In  IqP(x’  y)Vwu(x,  y)dxdy  =  —  fn  di vwp(x)u{x)dx,  we  obtain 


DUFRegL2NL  (u,  V )  (g) 


d\vw{v2W  wu)(x)g{x)dx. 


Ja 

Now,  the  partial  derivative  of  FRegR2NL  respect  to  v  with  direction  h  in  given  by 


DvFRegL2NL(u,v)(h)  =  2  /  v{x)\S7wu\2{x)h{x)dx. 

Jn 

From  the  previous  equations  we  obtain  the  partial  derivatives  of  E: 

DuFMsnl(u,v)  =  2 (u  -  u0)  -  2adivw(v2'V wu) 
DvFMsnl{u,v)  =  2av\Vwu\2  -  2^eAv  +  ^{v  -  1). 
We  recall  that,  the  definition  of  the  non-local  gradient  is 


(3.3) 

(3.4) 


|VM,u|2 


(u(y)  -  u(x))2w(x,  y)dy 


Jn 

and  taking  p{x,y)  —  v2(x)Vwu(x,  y)  in  the  definition  of  the  non-local  divergence  dhvP 
p(y,  x))sjw{x,  y)dy ,  with  div^p  :  Q,  — >  R  of  a  function  p  :  fl  x  Q  — >  R,  we  have 


(3.5) 
fQ(p(x,y)  - 


di vw(v2Vwu)(x)  =  /  (u(y)  -  u(x))(v2( x)  +  v2(y))sjw(x1y)dy  (3.6) 

Jn 

Local  and  non-local  gradient  based  algorithms  are  capable  of  detecting  edges  in  natural  images.  Local 
gradient  is  based  only  on  local  changes  (with  the  neighbour  positions)  in  intensity  level.  Therefore,  one 
important  limitation  of  the  local  gradient,  is  that  it  detects  edges  within  a  texture  that  might  not  correspond 
to  the  contour  of  an  object.  In  contrast,  non-local  gradient  is  based  on  non-local  information  extracted  from 
the  image,  therefore,  it  can  detect  a  contour  when  there  is  a  difference  between  the  textures  inside  and  outside 
the  object,  even  if  there  is  no  apparent  jump  in  gray  level  intensity.  The  non-local  framework  considers  that 
textures  are  characterized  by  the  repetition  of  a  pattern  within  a  determined  neighborhood.  Such  patterns 
are  associated  to  small  connected  regions  or  patches  in  the  image,  therefore  it  is  necessary  to  define  a  proper 
notion  of  distance  between  patches  for  computing  the  non-local  gradient  of  an  image.  Such  distance,  leads 
to  the  concept  of  weight  function  w,  wic.h  can  be  considered  as  an  intrinsic  descriptor  of  the  texture  changes 
in  the  image.  In  the  appendix  section  we  will  discuss  these  concepts  in  more  detail. 

The  norm  of  the  non-local  gradient  ||  VtI)u(£)||  at  a  given  point  of  x  of  the  image  u,  is  a  measure  of  texture 
discrepancy  within  a  neigborhood  of  the  point.  An  small  value  of  this  measure,  implies  that  the  texture  is 
homogeneous  in  a  neighborhood  of  the  point  and  a  large  value  means  that  a  texture  change  may  ocurr  in  a 
neighborhood  ot  the  point,  indicating  the  presence  of  an  adge  contour. 
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3.1.4  Object  Contour  Detection: 

We  detect  the  object’s  contour  based  on  the  11011-local  Mumford-Shah  algorithm.  This  algorithm  is  based 
on  the  concept  of  non-local  gradient  norm,  described  above,  therefore,  it  is  able  to  deal  with  images  with 
texture  changes  between  the  object  and  background.  The  algorithm  tries  to  identify  regions  in  the  image 
where  || Vlu«(a;)||  is  small,  leading  to  the  identification  of  different  regions  according  to  the  different  textures 
in  the  image. 


3.2  Computation  of  the  Index 

3.2.1  Contour- index  Computation: 

Once  the  object’s  contour  has  been  detected,  the  following  step  is  to  calculate  the  contour-index  I  that 
we  define  as  the  average  of  the  norm  of  non-local  gradient  ||Vll,w.(£)||  along  the  contour.  Observe  that  if  a 
low  value  of  I  means  that  the  texture  inside  and  outside  the  object’s  contour  C  is  similar.  Remmark  that  the 
contour-index  I  depends  011  the  weight  function  w  and  the  contour  C  and  the  correct  estimation  of  them  is 
a  difficult  task  as  it  involves  several  parameters  that  model  different  aspect  of  the  image  textures. 


3.2.2  Implementation  of  the  calculation  of  the  index 

The  continuous  setup  Given  an  image  u  and  the  contour  of  the  object  C  C  fi,  we  define  I(u,C)  as  the 
mean-integral  of  the  norm  of  the  non-local  gradient  over  the  contour,  that  is, 

I{u,C)  =  j  | S7wu\(x)dn(x),  (3.7) 

where  1(C)  is  the  length  of  C  and  /i  is  the  contour  measure.  We  observe  that,  if  the  texture  in  the  interior 
region  of  C  is  similar  to  the  texture  in  the  exterior  region  of  C  ,  then  I(u,C)  would  be  small.  Similarly,  in  the 
case  of  large  texture’s  dissimilarities  between  the  interior  and  exterior  region  of  C  then  I(u,C)  will  be  large. 
We  also  note  that  the  normalization  factor  jFj  makes  possible  the  comparison  of  the  index  of  the  object 
using  different  contour’s  length.  We  approximate  I(u,C)  in  3.7  by 

T(  *  _  In \Vwu*\(x)dx-  FRegL2NL(u*,v*) 

(U  ,V  ’  FAT(v*) 

where  (u*,v*)  are  the  minimizers  of  the  Mumford-Shah  cost  functional  3.1,  F^egLiNL  is  the  MS  regularization 
term  and  FAt'\ s  the  Ambrosio-Tortorelli  term.  Hence,  we  have 


I(u*,v*) 


fQ(  1  -  v*2(x))\Vwu*\(x)dx 
/Q(e|Vu*(a:)|2  +  {v*{x}~1)2  )dx 


(3.8) 


The  discrete  setup  Now  we  will  focus  in  the  case  of  2-dimensional  images,  i.e.  q  =  2.  We  follows  the 
implementation  of  the  non-local  weights  w  in  [14].  For  each  pixel  i  we  consider  a  neighborhood  JVj  including 
the  s  most  similar  pixels  (according  to  the  distance  d)  within  a  search  window  of  size  t.  Imposing  symmetry 
of  w  leads  to  at  most  2s  elements  in  AT*.  By  making  w[i,j]  =  0  if  j  £  Ni,  an  sparse  implementation  of  w  and 
the  consequent  savings  in  memory  consumption  are  obtained.  Let  «[*]  be  the  image  value  at  pixel  location  i 
with  i  =  1 ,n.  The  discretized  version  of  the  partial  derivatives  (3.9)  and  (3.10)  are  given  by: 


DuFMSNL(u,v)[i]  =  2 (u[i]  -  u0[i])  -  2adi vw(v2X7 wu)[i\  (3.9) 

DvFMSnl(u,v)[i\  =  2av[i\\V wu\2[i]  -  2jeA v[i]  +  ^(v\i\  -  1).  (3.10) 

From  (3.5)  and  (3.6), 

|Vu,u|2[f]  =  (ub1  -  u[i])2w[i,j].  (3.11) 

jeNi 

div„,(r;2Vu,u)[z]  =  Y  (ub1  ~  «[*])(^2H  +  v2\j])y/w[i,j]  (3.12) 

j&Ni 
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Similarly,  the  discretized  version  of  3.8  is 


I(i 


*)  = 


E*  Ejeiv^1  -  v*2[i\)(u*\j]  -  u*\i])2w[i,j] 


Ei(e|Vw*[*]|2  + 


4e 


) 


(3.13) 
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4.  Some  numerical  examples 
4.1  A  first  approach  in  ID 

In  order  to  better  understand  segmentation  in  textured  images,  we  have  to  first  study  a  simplified  situation. 
For  that  reason  we  created  some  synthetic  examples  in  ID  in  order  to  have  an  initial  idea  of  the  influence  of 
the  different  parameters  on  the  result  of  the  detection. 

We  started  by  having  function  with  a  ID  region  (element)  having  an  average  value  higher  than  the  rest  of 
the  ID  domain  (background).  In  the  first  stage  we  assume  that  the  global  reflected  (signal)  light  intensity  of 
the  element  is  different  from  the  background.  If  this  object  has  no  texture,  assuming  homogenous  materials, 
we  have  that  several  different  already  proposed  algorithms  are  capable  of  recovering  the  original  edges  of  the 
objects.  That  is  the  case  of  the  Total  Variation  (TV)  filter,  and  the  basic  Mumford-Shah  algorithm  with  the 
Ambrosio-Tortorelli  approximation  (MSAT). 

The  Total  Variation  filter  consisits  of  an  input  image  uo(-)  that  will  be  regularized  by  a  L\  norm  term  for 
the  gradient  of  the  denoised  image  as  in  equation  2.1,  where  the  problem  consists  of  finding  the  image  u(-) 
that  minimizes  the  functional  F(u).  On  the  other  hand,  the  MSAT  functional  is  the  local  specific  version  of 
the  non-local  Mumford-Shah  functional,  where  |Vu,ii|  is  replaced  by  the  usual  gradient  |Vw|  . 

Preliminary  numerical  implementation  for  non-local  segmentation.  For  the  first  synthetic  example,  which 
is  a  one  dimensional  case,  we  can  have  a  more  graphical  idea  of  the  performance  of  the  different  algorithms. 
We  start  with  an  object  with  constant  signal  level  (1.0)  inserted  in  a  black  background  (level  0.0).  We  add 
a  simulated  texture  which  is  a  sinusoidal  signal  with  amplitude  0.1. 

For  the  first  synthetic  example,  in  order  to  have  a  more  graphical  idea  of  the  performance  of  the  different 
algorithms,  we  start  with  an  object  with  constant  signal  level  (1.0)  inserted  in  a  black  background  (level  0.0). 
The  difference  with  the  usual  examples  is  that  we  add  a  simulated  texture,  where  we  tested  two  cases: 

•  a  sinusoidal  signal  with  amplitude  0.1 

•  a  periodic  squared  signal  with  amplitued  0.2 

functions  that  were  added  to  the  previous  generated  signal.  The  above  signal  reproduce  the  two  effects  in  the 
image.  The  big  jump  represents  the  discontinuity  of  the  image  that  can  be  obtained  by  classical  segmentation 
methods.  The  small  oscilations  of  the  signal  represents  the  texture  of  the  image  which  is  a  periodic  structure. 

We  consider  additive  gaussian  (white)  noise  in  order  to  simulate  a  signal  that  would  be  acquired  in  a 
more  real  context.  The  corresponding  generated  signals  are  shown  in  Figures  4.1(a)  and  4.2(a),  respectively. 

The  denosing  problem  consists  of  recovering  the  original  signal,  filtering  the  noise,  but  keeping  the  sharp 
transitions,  and  the  idea  is  to  also  recover  the  added  periodic  texture. 

For  the  simulations  and  denoising,  we  implemented  the  following  cases: 

•  regularization  with  the  L2  norm  of  the  gradient,  corresponding  to  a  convolution  with  a  laplacian  kernel, 

•  regularization  with  the  LI  norm  of  the  gradient,  problem  known  as  the  Total  Variation, 

•  regularization  with  the  Mumford-Shah  functional,  using  the  Ambrosio-Tortorelli  approximation, 

•  regularization  with  the  non  local  shift  invariant  wighted  gradient,  for  the  Mumford-Shah  functional, 

•  the  non-local  means  filtering. 

From  the  results  shown  in  Figures  4.1  and  4.2  we  can  obtain  the  following  preliminary  observations: 

•  the  linear  L2  norm  of  the  gradient  regularization  is  not  capable  of  recovering  the  sharp  transitions 
(edges  of  the  objects).  Whereas,  the  non-linear  filters  such  as  the  Mumford-Shah,  Total  Variation  and 
Non-Local  means  are  capable  of  keeping  the  sharp  transitions. 

•  the  Mumford-Shah  based  algorithms  (local  with  Ambrosio-Tortorelli  approximation  and  non-local  shift- 
invariant  gradient )  are  not  capable  of  recovering  the  texture. 

•  The  non-local  means  can  recover  the  sharp  transitions  and  is  able  to  detect  texture,  but  greately 
diminishes  the  contrast  of  this  texture. 
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Figure  4.1:  (a)  original  synthetic  signal  with  and  without  noise,  (b)  regularization  with  the  L2  norm  of  the 
gradient  (a  =  20),  (c)  regularization  with  the  L\  norm  of  the  gradient  (a  =  0.5)  (d),  regularization  with 
Mumford-Shah  ( a  =  50,7  =  1-®  —  4,e  =  0.5)  (e)  regularization  with  non-local  Mumford-Shah  ( a  =  10,7  = 
IE  —  4,6  =  0.5,  quadratic  spline  m  =  3),  and  (f)  non-local  means  [h  =  0.5,  a  =  0.5,  m  =  10) 


(d) 


(e)  (f) 


Figure  4.2:  (a)  original  synthetic  signal  with  and  without  noise,  (b)  regularization  with  the  L2  norm  of  the 
gradient  (a  =  20),  (c)  regularization  with  the  L\  norm  of  the  gradient  (a  =  0.5)  (d),  regularization  with 
Mumford-Shah  ( a  =  50,7  =  1-E  —  4,e  =  0.5)  (e)  regularization  with  non-local  Mumford-Shah  ( a  =  10,7  = 
IE  —  4,e  =  0.5,  quadratic  spline  m  =  3),  and  (f)  non-local  means  (h  =  0.5,  a  =  0.5,  m  =  10) 
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(a) 


« 


(b) 


(c) 


(d)  (e)  (f) 

Figure  4.3:  (a)  image  regularized  with  usual  Ambrosio-Tortorelli  approximation  of  MS  functional,  (b)  with 
non-local  gradient  with  B-spline  of  order  n=4  and  expansion  m=3,  non-local  gradient  with  expansion  m= 5, 
(d),  (e)  and  (f)  the  corresponding  contours  v  of  (a),  (b)  and  (c),  respectively. 


•  The  L i  norm  of  the  gradient  regularization  can,  to  a  certain  degree,  recover  the  texture  when  its  regions 
have  constant  gray  levels  (flat  regions) 

•  The  outputs  of  the  local  Mumford-Shah  and  non-local  Mumford-Shah  regularizations  give  more  infor¬ 
mation  than  the  other  methods  since  we  have  the  additional  information  of  the  contour  of  the  objects 
within  the  image. 

From  the  previous  observations,  we  can  conclude  that  a  combination  between  the  idea  of  the  non-local  means 
and  the  non-local  Mumford-Shah  regularization  may  improve  the  performance  of  finding  the  contours  between 
regions  with  different  textures. 

4.2  The  numerical  solution  for  2D 

4.2.1  First  approach  with  non-textured  images 

The  images  of  the  numerical  implementation  where  we  have  the  MS  Local  Regularizer,  and  the  non-local 
regularized  images.  As  we  can  see  in  Figure  4.3,  the  image  is  softened  as  we  increase  the  width  of  function  /(•), 
while  the  transition  of  the  contour  function  v  is  preserved,  but  having  a  wider  width.  This  last  consequence 
implies  an  easier  contour  finding  procedure. 

Given  the  preliminary  results,  in  the  future  work,  we  will  tackle  the  problem  of  regularizing  images  with 
texture.  Therefore,  we  will  continue  with  the  study  of  the  weight  function  ui(-)  for  different  cases.  We  believe 
that  some  clue  of  the  texture  information  will  be  embedded  in  this  weight  function. 

4.2.2  Experiments  with  Textured  Images 

The  non-local  MS  regularization  was  applied  to  a  set  of  artificial  and  natural  images,  the  latter  taken 
from  the  the  Berkeley  Segmentation  Dataset  [16].  Also,  we  illustrate  the  benefits  of  non-local  MS  over  local 
MS.  In  our  experiments,  we  realize  that  the  quality  of  the  results,  highly  depends  on  the  parameters  involved 
in  the  computation  of  the  weight  function:  patch  size  (to),  window  search  size  (s),  numbers  of  neighbors 
(t),  scale  parameter  for  weight  function  ( h )  and  apodization  function  (A).  These  parameters  model  different 
aspects  of  the  image  texture,  thus  are  intrinsic  to  each  image  and  we  have  determined  experimentally. 

After  we  have  computed  the  wight  function  w  we  carry  out  the  minimization  of  3.1  using  the  gradient 
descent  alternating  minimization  algorithm  described  in  Section  3.1.2.  This  algorithm  also  depends  on  several 
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parameters  that  we  determine  experimentally  for  each  image:  weight  of  regularization  term  (a),  weight  of 
Ambrosio-Tortorelli  term  (7),  gamma  convergence  parameter  (e),  gradient  descent  step  (5  ),  number  of  outer 
iterations  ( K )  and  number  of  inner  iterations  (J).  We  also  compute  the  local  MS  segmentation  in  order  to 
compare  with  the  non-local  MS  results.  Figure  4.4  shows  4  artificial  images  (first  row)  and  the  segmentation 
results  after  the  non-local  (second  row)  and  local  (third  row)  MS  algorithm.  Each  image  consist  of  an 
object’s  texture  and  a  background’s  texture.  We  have  created  such  texture  by  the  regular  replication  of  a 
given  pattern.  We  observe  that  the  non-local  MS  algorithm  successfully  detects  the  object’s  contour.  In 
contrast,  the  local  MS  algorithm  detects  edges  within  texture  so  is  not  able  to  detect  object’s  contour.  The 
first  row  in  Figure  4.5  shows  the  same  images  as  in  the  first  row  of  Figure  4.4  after  the  addition  of  random 
Gaussian  noise.  Second  and  thirds  rows  of  Figure  4.5  show,  non-local  and  local  MS  segmentation  results 
respectively.  We  observe  that  the  presence  of  noise  affects  the  capability  of  the  algorithm  to  detect  the 
object’s  contour.  Figure  4.6  shows  3  natural  images  (first  row)  and  the  segmentation  results  after  the  non¬ 
local  (second  row)  and  local  (third  row)  MS  algorithm.  In  contrast  to  the  artificial  images,  natural  images 
exhibits  textures  with  a  complicated  pattern  structure.  Such  patterns  present  differences  in  gray  level,  shape 
and  sizes.  To  partially  overcome  this  issue,  a  Gaussian  smoothing  filter  preprocessing  step  is  applied,  to 
makes  such  features  more  homogeneous.  We  observe  that  non-local  MS  algorithm  performs  better  than  the 
local  MS  algorithm,  in  the  identification  of  contours  in  areas  with  a  change  of  texture.  O11  the  other  hand, 
the  local-MS  seems  to  perform  better  in  the  identification  of  contours  in  areas  with  a  change  of  intensity 
level. 


Figure  4.4:  artificial  image  no  noise. 
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Figure  4.5:  artificial  image  with  noise. 


Now,  we  asses  behavior  of  the  contour-index  described  in  Section  3.2.2  by  considering  an  artificial  image 
with  different  levels  of  noise.  Figure  4.7  exhibits  from  left  to  right,  an  artificial  image  with  no  noise  and 
additive  random  Gaussian  noise  with  variance  20,  40,  60  and  80  respectively.  The  respective  contour-index 
values  are  1.83,  0.58,0.19,  0.09  and  0.04.  Observe  that  the  contour-index  value  decrease  with  the  level  of 
noise  and  the  object  is  more  difficult  to  be  distinguish  visually. 
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Figure  4.6:  natural  image  no  noise 


Figure  4.7:  index  noise 
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5.  Concluding  Remarks  and  Future  Work 


So  far,  none  of  the  tested  methods  give  a  satisfactory  result  in  terms  of  recovering  the  original  texture, 
from  an  image  with  noise.  It  is  expected  that  a  combination  of  some  of  those  methods  would  greately  improve 
the  performance.  However,  an  efficient  numerical  implementation  of  such  a  combination  of  algorithms  is  not 
straightforward. 

For  the  next  stage  we  consider  the  issue  of  finding  the  local  periodicity  between  the  texture  elements 
(textels).  The  local  periodicity  may  be  obtained  from  the  information  provided  by  the  analysis  of  the  weight 
function  of  the  non-local  Mumford-Shah  functional.  While  the  textels  should  be  expressed  in  terms  of  a 
multiscale  base  such  as  the  one  provided  by  wavelets. 

On  the  other  hand,  the  images  to  be  tested  will  have  to  include  some  camouflage-like  textures. 

5.1  Implementation  of  algorithms 

As  pointed  out  in  Section  4,  the  next  step  involves  the  implementation  of  the  minimization  of  non-local 
Mumford-Shah,  including  the  optimization  of  the  weight  function  w(x,  y).  This  would  enable  a  more  accurate 
definition  of  the  countours  of  the  objects  with  textures.  Our  assumption  is  that  the  optimization  involving 
the  weight  function  would  improve  the  performance  of  the  already  proposed  methods. 

5.2  Segmentation  Problems  and  Textured  Images 

In  a  future  report  some  modifications  will  be  added  in  order  to  consider  the  segmentation  of  different 
textured  regions,  but  with  a  similar  brightnes  levels.  In  the  current  stage,  we  tested  some  of  the  algorithms  in 
ID  with  artificial  texture,  being  sinusoidal  or  staircase,  but  different  average  gray  levels.  Since  this  analysis 
was  based  on  the  ID  simulations,  still  the  extention  to  2D  has  to  be  considered,  since  this  extention  is  not 
always  straightforward.  The  examples  given  in  this  work  present  less  difficulties,  since  many  algorithms  are 
able  to  segment  by  taking  advantage  of  the  different  mean  levels  of  each  texture.  Therefore,  more  difficult 
images  will  be  tested,  where  the  biggest  difference  between  the  textures  will  be  in  the  periodicity  or  the  phase 
of  the  texels.  Special  attention  will  be  payed  to  this  last  issue  of  change  of  phase  in  the  texels,  since  this  may 
be  considered  as  a  change  in  the  texture.  Therefore,  this  discontinuity  would  mean  the  border  of  an  object 
within  the  measured  image. 

5.3  Calculation  of  the  index  of  the  camouflage 

Once  the  countour  is  established  by  one  of  the  algorithms  described  in  the  report,  the  numerical  (dis)GONTiNUiTY 
index  has  to  be  calculated,  considering  the  characteristics  of  the  texels  or  difference  between  the  textels  along 
the  border.  The  relation  between  this  index  and  the  reaction  of  a  human  observer  has  to  be  checked,  since 
this  is  part  of  the  original  idea  of  the  first  stage  of  the  design  a  camouflage  pattern. 
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A.  Some  concepts  about  texture 


Try  to  characterize  the  texture  concept  takes  relevance  when  we  want  to  camouflage  something,  because 
our  first  impression  is  that  the  human  eye  recognizes  the  change  between  different  textures,  see  Figure  A. 


Figure  A.l:  The  change  between  different  textures 

Define  texture  is  a  difficult  task,  because  we  don’t  have  a  precise  and  unique  mathematical  definition  or  a 
clear  concept.  The  basic  idea  is  that  the  texture  can  be  seen  as  a  repetition  of  basic  texture  elements  called 
texels  or  textons  made  of  pixels  whose  placement  obey  some  rule. 

Let  us  give  some  recompilations  ideas  about  textures: 

1.  The  textured  region  can  contain  texture  elements  of  various  sizes,  each  of  which  can  itself  be  textured. 

2.  The  order  consists  in  the  nonrandom  arrangement  of  elementary  parts. 

3.  The  parts  are  roughly  uniform  entities  having  approximately  the  same  dimension  everywhere  within 
the  textured  region. 

4.  The  parts  are  roughly  uniform  entities  having  approximately  the  same  dimension  everywhere  within 
the  textured  region. 

5.  A  region  in  an  image  has  a  constant  texture  if  a  set  of  local  statistics  or  other  properties  of  the  picture 
function  are  constant,  slowly  varying,  or  approximately  periodic. 
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B.  Mathematical  aspects  of  the  Variational  PDE  Models 


In  the  framework  of  the  local  Variational  PDE  Models,  the  approaches  are  proposed  with  rather  local 
based  measurements  (such  as  gradients)  for  the  smoothness  term.  This  is  quite  suitable  for  segmenting  objects 
with  constant  luminosity  characteristics.  However  for  the  cases  where  texture  is  present,  these  algorithms 
fail  to  output  a  good  quality  segmentation.  For  that  purpose  non-local  regularity  measurents  have  been 
proposed. 


B.l  The  Mumford-Shah  regularization 


We  recall  that  the  segmentation  problem,  considering  a  variational  approximation,  was  introduced  by 
David  Mumford  and  Jayant  Shah  in  1989  [19].  The  problem  addressed  by  the  authors  can  be  describe  as 
follows:  Given  and  image  I  C  M2,  we  want  to  decompose  the  image  in  a  finite  number  of  subsets  Ri  C  R2 
and  edges  AT.;,  such  that  at  the  interior  of  each  component  Ri  the  intensity  of  the  image  is  almost  constant 
and  the  edges  are  smooth. 

This  problem  corresponds  to  an  optimization  problem  of  given  and  image  it0,  where  w0  represents  the 
intensity  of  the  image,  to  find  a  function  u  and  a  curve  K,  which  minimizes  the  functional 


min  M S'fw,  K  |w0] 

K,u 


minH1(A') 

K,u 


;  /  |Vw|2  +  A  f  (u-  uo)2 
Jn  Jn 


where  A1  (AT)  corresponds  to  the  arc  length  of  K. 

To  solve  the  above  optimization  problem,  Luigi  Ambrosio  and  Vincenzo  Maria  Tortorelli  in  1990  (see 
Approximation  of  functionals  depending  on  jumps  by  elliptic  functionals  via  T-convergence,  Comm.  Pure 
Appl.  Math.,  43(8),  999-1036)  introduced  an  aproximated  proble  for  MS  functional.  This  approximated 
problem  have  several  advantages  from  theoretical  and  numerical  point  of  view,  for  details  of  the  presentation 
of  these  functional  we  refer  to  the  report  of  bibliographic  discussion. 

In  the  case  of  texture  segmentation,  we  want  to  use  a  similar  approach  to  the  one  for  the  classic  problem 
of  segmentation,  however  we  must  consider  the  aspect  of  the  periodicity  of  the  patterns,  that  is,  we  must 
consider  a  non  local  approach.  Nevertheless,  it  is  necessary  to  include  a  term  which  allow  us  to  compare 
texels  in  the  image,  and  for  that  reason  we  consider  a  functional  including  the  nonlocal  efects.  In  this  way,  we 
introduce  the  non-local  segmentation  with  the  variational  approach  Ge  :  A1(D)  x  A1(D)  — >  [0,  +oo],  defined 
by 


Ge(u,v ) 


< 


(V’O(z))  l|Vu,w(£)||5  +  \v{v(x))  +£||Vu(f)||2)^  dxN 


if  u,  v  G  H1( fl), 
and  0  <  v  <  1  a.e. 


+oo 


otherwise, 


where  V  :  [0, 1]  — >•  [0,  +oo)  be  a  continuous  function  vanishing  only  at  the  point  1,  and  let  if;  :  [0, 1]  — >  [0,  +oo) 
be  an  increasing  lower  semicontinuous  function  with  ^>(0)  =  0,  ip(  1)  =  1,  ip(t)  >  0  if  t  ^  0. 


B.2  The  non-local  means 

Recall  that  the  non  local  means  technique,  based  on  the  work  of  Buades  et  ah,  have  been  developed  in 
the  context  of  denoising  filtering. 


B.2.1  Basics 


The  idea  of  Buades  is  defined  as: 

Consider  a  continuous  noisy  image  uo  defined  on  D,  a  bounded  domain  of  M2  then,  the  filtered  image  in 
the  pixel  i  is  defined  by: 


NL(uq)(x)  =  u{x) 


1 

C{x) 


(Ga*|uo(*+-)-izo(v+0r)(O) 

A2 


uo(y)dy 


1 

W) 


w(x,y)u0(y)dy 
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where  G(x)  is  a  normalizing  factor  and  Ga(t)  is  a  Gaussian  Kernel  with  standard  deviation  a. 
notice  also  that: 


(Ga  *  \uo(x  +  •)  -  uo(y  +  -)|2)(0)  =  [  Ga(t)\u0(x  +  t)  -u0(y  +  t)\2dt 

Jr2 

In  the  discrete  case  (which  is  the  one  that  we  work  in  fact)  we  have  the  following  formulation: 

Consider  a  discrete  noisy  image  ug  =  {uo(i)/i  £  II},  in  this  context,  the  estimated  value  NL(u0)(i)  is 
computed  as  a  weighted  average  of  all  the  pixels  in  the  image,  i.e.  considering: 


NL(uo)(i)  =  u(i)  =  ^ -rzJ2w(i,j)u0{j) 

jen 

where  w(i,j)  must  be  a  sort  of  discretization  of  the  continuous  formula  described  before,  it  intends  to 
measure  the  similarity  between  pixels  i  and  j:  in  a  general  context  we  have  to  consider  w{i,j)  such  that: 


^ Zw(i’j )  =  1  0  <  w(i,j)  <  1 
jen 

The  (sort  of)  discretization  of  the  weight  function  is: 

1  _  ll«o(Ari)-u0(-Wj)ll! ,a 

w{i,j)  =  ^rre  ^ 

Z(i) 

where  Z(i)  is  a  normalization  factor,  ||  •  ||2,0denotes  the  Euclidean  weighted  distance  (by  a  Gaussian 
kernel  of  standard  deviation  a)  and  A /j  denotes  a  “neighborhood”  of  the  pixel  i,  centered  at  it,  usually  this 
neighborhood  is  a  square  of  length  2 to,  then  uo(A/i)  is  a  vector,  so,  the  idea  of  this  discretization  is  to  compare 
the  patch  centered  on  i  between  the  one  centered  on  j. 

The  important  element  here  is  the  weight  function  w(x,y)  used  in  the  formula  which  perform  the  “aver¬ 
aging”  term.  This  function  is  considered  in  a  different  context  to  perform  segmentation  (using  it  in  the  so 
called  non-local  gradient  or  weighted  gradient  as  we  seen  before).  It  is  important  to  follow  the  behaviour  of 
this  function  in  the  task  of  denoising,  specially  in  in  terms  of  the  pursued  main  characteristic  of  preservation 
of  textures  in  denoising  procedure.  Then,  is  important  to  have  an  implementation  of  this  algorithm,  in  order 
to  have  a  checkpoint  for  next  steps  of  this  work. 


B.2.2  Implementation  of  Non  local  means 

In  order  to  implement  the  denoising  algorithm  for  NL-means  we  have  to  try  to  reduce  the  number  of 
calculations,  as  we  seen  before,  the  original  filter  involves  a  high  number  of  calculations  (at  least  2 N2  for 
each  pixel  if  N  is  the  number  of  pixels  of  the  image).  In  order  to  do  that,  we  will  follow  the  “faster”  version 
of  NL-means  proposed  by  Buades,  which  try  to  reduce  calculations: 

Let  I  a  grid  of  pixels,  choose  a  subset  {*i , . . . ,  Zfc }  C  I  .  Consider  B  =  {i/||i||  <  m},  and  then  define: 
Wk  =  ik  +  B.  The  idea  is  to  divide  /  in  non-disconnected  regions  such  that:  I  =  U iWi  and  Wj  ft  Wj+i  ^  0. 
The  idea  is  to  define  NL-means  for  the  Wk  objects  (the  so  called  vectorial  NL-means)  and  then  defining  the 
NL-means  for  a  fixed  pixel  as  the  average  of  the  vectorial  NL-means  where  this  pixel  belongs. 

Let  us  define  the  vectorial  NL-means,  for  each  1 1 as: 


NL{Wk) 


—  ^2u0(Wj)e~ 


jei 


ll«o(W'*,)— «o(w-j)l  la 


where  Ck  is  a  normalization  parameter. 

Notice  that  in  this  case,  the  norm  involved  is  the  usual,  since  we  restore  at  the  same  time  the  whole 
neighborhood  and  do  not  want  to  give  any  privilege  to  any  point  in  particular. 

Finally,  in  order  to  restore  the  value  at  a  pixel  i,  we  must  consider  all  Wk  containing  i,  so,  if  we  define: 
Ai  =  {k  /  i  £  Wk},  we  have  to  define: 


NL{u0){i)  =  u(i)  =  ij-r  £  NL{Wk)(i) 

'  1 '  keAi 
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NL(u0)(i)  =  u(i)  =  ]4-t  NL(wk)(i) 

'  keAi 

We  attach  an  example  of  this  implementation  for  the  one  dimensional  case  in  numerical  examples  chapter, 
the  example  consists  on  a  signal  with  texture  (like  a  castle)  which  have  been  exposed  to  pseudo-random  noise, 
the  idea  is  to  check  if  this  implementation  preserves  the  most  important  setting  of  the  XI.  means  algorithm: 
remove  noise  without  removing  texture. 


B.3  Non-local  variational  image  models 

In  other  words,  if  we  have  a  patch  px(t)  =  f(x  +  t),\/t  £  [— r/2,r/2]2,  with  r  the  width  and  height  of  the 
patch.  The  L2  patch  distance  will  be: 

da(Px,Py)=  /  ga(t)\px(t) -py{t)\2dt 

J  n 

where  ga(t)  is  a  gaussian  with  variance  a. 

On  the  other  hand,  non-local  formulation  involves  the  use  of  a  weighted  gradient,  that  is  the  continuos 
equivalent  of  the  graph-cut  approach: 

\\7wu\2(x):=  /  (u(y)  —  u(x))2w(x,  y)dy 
Jn 

This  formulation  extends  in  a  quite  straight-forward  way  the  original  Mumford-Shah  model  to  a  non-local 
setting.  The  disadvantage  lies  on  the  extra  complexity  of  the  additional  integral  on  the  region.  Computa¬ 
tionally,  it  is  difficult  to  determine  the  weight  function  w(x,y)  for  each  specific  case.  Some  examples  for  this 
weight  function  are: 

•  Shift-invariant  function  such  as  a  gaussian  w{x,y)  =  ga{x  —  y)  one. 

•  The  patch  distance  da(px,Py)  as  previously  mentioned. 

•  The  NL-means  weight,  where  the  w(a :,y)  =  gh{da(px,py))- 


B.4  The  basic  model  for  obtaining  the  contour 

By  defining  the  square  of  the  module  of  the  non-local  gradient  as: 


V»«(x)  =  /  (u(x)  -  u(y))2w(x,y)dyN 

11  J  n 

we  denote  the  specific  shift  invariant  cases  as  w(x,y)  =  g(x  —  y). 

The  non-local  segmentation  with  the  variational  approach  we  have  the  formula 

Fnlg=  [  \\Vgu(x)\\2dxN 
Jn 

while  the  gradient  ( Gateaux  differential)  of  this  function  Fmlg  with  respect  to  function  u. 

dFQu°  =4I.9Il1w(/)-45*u(/), 

if  parity  of  the  function  g  is  imposed:  g(—x)  =  g(x),  and  *  is  the  convolution  /  *  g(x)  =  f{y)g{x  —  y)dyN . 
On  the  other  hand,  the  regularization  term  of  the  output  function  which  is  described  as 


FNLRegAT{u,v)  =  /  ||  Vffu(£)  || 2  V2 (x )da 

Jn 


,N 


has  a  Gateaux  differential  of 


dFNLRegAT(u,  v) 
du 


{d'wg}  =  2 u(k)g  *  v2(k)  —  2 g  *  ( uv2){k )  —  2 v2(k)g  *  u2(k)  +  2  \g\Ll  u{k)v2(k) 
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As  one  of  the  objectives  is  to  use  B-spline  functions  in  order  to  implement  these  procedures,  the  g(-) 
function  is  an  expanded  B-spline  of  order  n:  g(x)  =  /3n(x/m),  where  m  is  the  scaling  factor  of  expansion, 
where  integer  values  were  used  instead.  By  using  these  kind  of  functions,  it  is  possible  to  have  efficient 
convolutions,  since  b^x)  =  (3n(x/m)  =  bi*u!^(x),  where  u^{x)  =  ^ b *  •  •  •  *  b^x).  The  previous  function 

n  times 

may  be  regarded  as  applying  n  times  a  moving  average  process  of  window  width  rn.  One  characteristic  of 
these  families  of  fuctions  is  that  they  assimptotically  converge  to  a  gaussian  function  as  n  grows.  Gaussian 
functions  are  usually  used  since  many  effects  on  image  acquisition  such  as  bluring  or  other  distortions  may 
be  modelled  with  this  function. 


B.5  The  convergence  of  the  functional 

Let  us  consider  a  regularity  of  the  non-local  gradient,  given  by 


l|Vwu(*)||i=  [  (u(x)  -  u(y))2w(x,y)dyN  +  5\\Vu(x)\\2, 

Jn 

where  6  >  0,  the  function  w  £  Cl(Vt  x  O  \  {(&,  y)  :  x  =  y}),  with  w(x,  y )  =  w(y,  x)  and  satisfying 

limj|a?  -  y\\2w(x,y)  =  d0  >  0. 


L\n)  x  L^n)  ->  [0,+oo] 

if  u,  v  €  H1(fi), 
and  0  <  v  <  1  a.e. 

otherwise, 

where  V  :  [0, 1]  — >  [0,  +00)  be  a  continuous  function  vanishing  only  at  the  point  1,  and  let  ip  :  [0, 1]  [0,  +00) 

be  an  increasing  lower  semicontinuous  function  with  ^>(0)  =  0,  ^>(1)  =  1,  ip(t)  >  0  if  t  7^  0. 

We  want  to  compute  the  T-limit  of  the  Ge  :  L1(0)  x  L1(0)  — >  [0,  +00].  Following  the  ideas  of  Ambrosio- 
Tortorelli,  we  proposed  as  r-limit  as  the  functional  G  :  L1(0)  x  Ll(VL)  — >  [0, +00], 


We  introduce  the  non-local  segmentation  with  the  variational  approach  Ge  : 


Ge(u,v)  =  < 


ip{v(x))  ||Vwu(f)||5  +  ^V(v(x))  +£||Vn(f)||:2)  )  dx 


+00 


N 


G(u,  v) 


< 


\\Vwu(x)\\2sdxN +4cvHn~1(Su) 


if  -u  e  svB(n ), 

and  v  =  1  a.e. 


+00 


otherwise, 


with  cy  —  Jo  \JV ( s)ds .  The  difference  between  the  Ambrosio-Tortorelli’s  result  is  the  non-local  gradient  in 
the  above  expressions. 

This  class  of  result  says  us  that  when  we  consider  a  numerical  approximation  of  inf  Ge  for  e  small,  actually 
we  are  consider  an  approximation  of  inf  G. 
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C.  A  Stochastic- Variational  Model  for  Soft 
Mumford-Shah  Segmentation[20] 

C.l  Introduction  to  the  model. 

Given  an  image  I  in  L2(fl),  defined  in  O  C  R2,  which  we  assume  to  be  open,  bounded  and  smooth,  we 
know  that  the  goal  of  the  classical  segmentation  problem  is  to  find  a  closed  ‘edge  set’  T  and  all  the  connected 
components  f i  £  {1,  ...,AT}  associated  to  \  T,  based  in  some  desired  visual  measure  (e.g.  textures). 
Notice  that  with  this  process  the  image  /  is  discontinuous  along  V  and  relatively  soft  or  homogeneous  inside 
each  connected  component  f 

In  this  context  we  will  call  to  each  patch  p  =  I |n;  a  ‘pattern’  with  an  associated  support  f 
We  will  call  this  standard  procedure  a  ‘hard’  segmentation,  because  we  make  a  partition  of  Vt  based  on  the 
fixed  borders  contained  in  h.  giving  us  a  collection  of  disjoint  supports  flj,  so,  with  this  segmentation  we 
have: 

K 

M®)  =^lo,(a:) 

i= 1 

The  main  idea  of  the  soft  segmentation  is  make  a  ‘softer’  or  ‘weak’  partition  of  li,  in  the  following  (formally) 
sense: 

K 

In  (a:)  =  ^Pi{x) 

i=l 

Where  pi  are  a  continuous  (or  even  smooth)  functions,  formally  we  can  choose:  pi  =  lnt  *  pe  with  pe  a 
standard  regularization  function. 

We  can  connect  this  idea  with  the  widely  know  ‘mixture  image  models’  in  the  following  way: 

Suppose  that  the  image  /  could  be  decomposed  in  K  patterns  indexed  by  u>  £  {1, . . . ,  A'}.  Then,  each  pixel 
x  £  fl  could  be  assigned  to  a  pattern  by  the  ‘index  random  variable’  oj(x)  £  {1, . . . ,  K}.  In  this  context  we 
can  take  Pi(x)  =  P(w(:r)  =  i)  with  i  £  {1, . . . ,  A'},  this  is  the  natural  stochastic  interpretation,  also,  we  will 
call  each  pi  the  ‘ownership’  of  the  pattern  i. 

In  contrast  of  the  ‘hard’  segmentation,  this  ‘soft’  one  allows  to  each  pixel  belong  to  all  the  patterns  with  some 
probability,  then,  this  is  a  generalization  of  the  ‘hard’  segmentation,  and  allows  us  to  have  more  versatility 
with  some  natural  phenomena  when  we  usually  don’t  have  a  ‘clear’  boundaries,  instead  of  that  we  can  assign 
probabilities  in  the  transtition  regions. 

The  model  that  we  will  study  is  based  in  the  widely  known  Mumford-Shah,  which  is  based  in  the  following 
variational  problem: 

min  J[m,T|/]  =  min'H1(T)  +  a  /  |Vw|2  +  A  [  (u  —  I)2 
r-“  r>  Jn  J  n 

The  main  advantage  of  using  a  stochastic- variational  modelation  is  that  we  have  more  universality  in  modeling 
due  to  stochastic  approach,  and  also  we  have  a  rigorous  mathematical  analysis  and  better  computational 
methods  due  to  variational-PDE  approach. 

C.2  The  Soft  Mumford-Shah  (SMS)  Model 

Our  aim  is  minimize  the  energy  associated  to  K  patterns,  i.e.  we  want  to  minimize: 

JSMS[P,U\I }  =  \  J2  f  (ui~I)2Pr  +  I  |Vu|2  +  J2  I  (ve\Vpt\2+(Pl(1~P,))2 

i= i  i= i  j=i  Ja\  £ 

-  " - - - -  +  ' - - - '  +  " - - - 

—  ipSM  S  I  pSM  S  I  rpS  M  S 

r  sim  reg  r  MM 

with  the  constraint:  P  :  0  — >  Ak_i  this  is:  p.i  >  0,  i  £  {1,. . . ,  A'}  and  Y^f=iPi  =  1- 
Due  to  the  structure  of  this  energy  we  will  need:  pi  £  H1{ fl)  Vi.  Then,  under  the  supposition:  /  £  L2(fl) 
the  energy  will  be  well-defined  in  the  following  admissible  set: 

admK  :=  {(P,  U)  \  p^ut  £  H1^),  i  £  AT},  P  :  Q  — >  A  K_{\ 
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The  construction  of  this  energy  and  main  mathematical  results  about  it  can  be  seen  on  the  appendix. 


C.3  Euler  Lagrange  equations  for  the  SMS  Energy 

C.3.1  Euler  Lagrange  equations  in  the  (K  —  1)  simplex 

To  minimize  JSMS[P,  U\I]  we  can  use  a  gradient-descent  method  or  the  Euler  Lagrange  equations.  We 
will  compute  this  equations  and  bring  a  computational  scheme  to  solve  them. 

The  first-order  partial  variation  on  U  given  P  leads  to  the  following  system  of  PDEs,  for  i  £  {1, . . .  ,K}: 

=  0  in  (C.l) 

=  0  in  dn  (C.2) 

the  proof  of  this  is  standard  and  thus  omitted. 

Notice  that  the  first  equation  could  be  rewritten  in  the  form: 

- aAm  +  ( \pi}ui  =  (A  pi)  I 

which  is  a  Poisson  equation  with  variable  coefficients  (this  is  good  because  there  exists  numerical  methods 
to  solve  this  equation  efficiently) . 

In  the  case  of  the  first-orden  partial  variation  on  P  given  U  we  have  the  following  system  of  PDEs:  For 

i  e  K} 


aAui  +  A  (I  -  Ui)pi 
du.j 
dn 


Vi{x)  —  (V)(x)  =  0  in  (C.3) 

V{ (x)  —  ( v)(z )  =  0  in  dfl  (C.4) 

with  (w)  =  -C  Yh=i  wi ,  V  =  A (Ui  -  I)2  -  18eA p.t  +  f  (1  -p*)(l  -  2 p.^,  vt  = 

Details  are  given  in  the  appendix. 

Joining  this  results  we  have  the  following  theorem 

Theorem  C.3.1.  [Euler  Lagrange  Equations  of  SMS] 

We  have  the  following  system  of  equations:  for  i  £  {1, . . . ,  K} 

—  aAui  +  (A Pi)ui  =  ( Xpi)I  in  Q 

- 18sApi  +  2s~1pi(l  -  pi){  1  -  2 =  (V)  -  X(ui  -  I)2  in  D 

with  V  =  V(P,  U)  =  (Vi, . . . ,  Vk)  and  Neumann  condition  in  Ui  and  Vi . 

C.3. 2  Computation  of  Euler  Lagrange  equations 

To  solve  the  equations  of  the  last  theorem  we  will  use  a  algorithm  called  ‘alternating  minimization’  (AM) 
which  is  related  with  the  widely  known  algorithm  called  ‘expectation  maximization’  (EM)  used  in  statistics 
with  ‘hidden  variables’  [9] [13].  In  this  context  we  will  treat  pi  as  a  ‘hidden  variables’. 

AM  is  a  progressive  algorithm:  Given  the  actual  best  estimate  (t  =  n)  of  Un  =  (uf  \  i  =  1, . . . ,  K ),  we  solve: 

Pn  =  ar  grain  PJSMS[P\Un,I]  (C.7) 

or  equivalently: 

-18eA Pi  +  \{l  -  Pi)(  1  -  2Pi)  =  (Vn)  -  A«  ~  I)2  i  £  { 1 . K} 

£ 

with  Neumann  homogeneous  conditions. 

Then,  after  solving  this,  we  get  Pn,  we  get  Un+1  via: 

Un+1  =  argminuE[U\Pn ,  I ]  (C.8) 


(C.5) 

(C.6) 


DISTRIBUTION  A:  Distributing  approved  for  public  release 


or  equivalently: 


-«A m  +  (A p?)Ui  =  (A p?)I  i  €  {1, . . . ,  K} 
with  Neumann  homogeneous  conditions. 

Notice  that  this  last  equations  are  a  linear  decoupled  system,  then,  is  easy  to  solve.  The  hard  part  is  solve  the 
first  system  of  equations,  because  is  coupled  and  non-linear,  we  need  to  linearize  this  system  in  the  following 
way: 

Let  e.i(x)  :=  ( m(x )  —  I(x))2  and  e  =  (e,  |  i  £  {1, . . . ,  K}),  V  =  V{P,  U)  =  V(P,  e). 

We  want  to  solve: 

-18eA pi  +  -pi(  1  -  pi)(  1  -  2 =  ( V )  -  Ae.; 
e 

notice  that: 

1  K  x  K  o  K  2 

(V(p,e))  =  —  ^(-18eA Pi  +  Ae;  +  2e  1pi(l  - Pi){l  -  2 Pi))  =  —  J2^2pi  ~  3p^  + 

i=  1  i—1  i—1 

this  is  because  ^pi  =  1  and  A (X)Pi)  =  0 

Notice  also  that:  Pi(  1  -  Pi)(  1  -  2 pt)  =  Pi(l  -  pi)2  -  p2(  1  -  Pi). 

Then,  the  non  linear  PDE  can  be  solved  in  iterative  scheme: 

..._).  p(J)  p(i+1)  —>.... 


via  the  following  linealization: 

-  18sAp^  +  2£P?+1\l  ~pf)2  =  /f  (C.9) 

with  =  —A e*  +  {V{P^\e)  +  |(p^)2(l  —  p\^)  and  Neumann  homogeneous  condition. 

This  linear  system  of  Poisson  equations  can  be  solve  using  known  solvers.  The  theoric.  convergence  is  still  an 
open  problem. 

C.4  Numerical  Examples 

C.5  Appendix:  Mathematical  Facts 

C.5.1  Construction  of  the  Energy  for  The  Soft  Mumford-Shah  (SMS)  Model 

In  this  section  we  will  discuss  the  main  points  of  the  construction  of  the  SMS  Energy.  First  of  all,  let  K 
the  number  of  patterns  involved  (which  can  be  a  variable  of  the  model  to  be  found  optimally,  but  in  [21]  [22] 
they  proved  that  is  not  necessary).  Given  /  =  /( x)  an  image  in  fl  open,  bounded,  regular  domain,  our  first 
objective  is  to  compute  the  ownerships  p\(x), . . .  ,pk(x) 

Let  P(x)  =  (pi(x), . . .  ,pk(x ))  and  A/f-i  :=  conv(e . . . ,  ex)  where  conv  is  the  convex  hull  of  the  vectors  e) 
(The  canonical  basis  of  MA )  this  set  is  usually  called  the  canonical  ( K  —  1)  simplex  or  probability  simplex. 
Then  we  have  P  :  tt  — >  Ax-i- 

Associated  with  the  label  of  each  pattern  (ui  =  i ),  we  have  Ui(x)  £  iL1(12). 

Define  also  U(x)  =  (ui(x), . . . ,  uk(x)). 

Our  aim  is  estimate  the  optimum  pair  of  ownerships  and  patterns  given  /,  mathematically  we  want  the  pair 
(P*,  U*)  such  that: 

(P*,  U*)  =  argmaX(ptu)f‘(P1  U\I ) 

But,  due  to  the  Bayes  formula  and  taking  que  logaritmic  likelihood  (i.e.  using  J[-]  =  —  logP(-)  known  also 
as  the  Gibbs  Energy)  we  deduce  that  our  problem  will  have  the  following  form: 

argmiri(P’U)J[P ,  U\I]  =  argmin(ptu)J[I\P,  U]  +  J[P ]  +  J[U ] 

Now,  assuming  that  the  pattern  channels  have  independent  components  we  can  assume:  J[U\  =  J[u\, ,  Uk\  = 
EiLi  «^[wi|i]  this  is  motivated  by  the  fact  that  a  2D  image  is  a  projection  of  the  3D  environment  where  dif¬ 
ferent  objects  in  3D  are  located  independently  In  range  and  depth. 

Then,  for  patterns  in  if1,  we  can  impose  homogeneous  energies  (the  idea  is  penalize  changes  in  luminosity): 

J[Ui\i\  =  J[Ui }  =  a  |Vtti|2  ie{  1,  .  .  .  ,  K}  =  Freg[Ui] 

Jo. 
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Figure  C.l:  SMS  segmentation  with  three  (left)  and  four  phases  (right).  In  the  left  we  have  the  original 
image  and  the  three  ownership  distributions:  pi(x),p2{x),p3(x).  In  the  right  we  have  the  four  ownership 
distributions  pi(x),p2(x),p3(x)1p4(x).  Notice  that  the  ocean  pattern  is  “absorbed”  into  the  grass  pattern  in 
the  three  phases  segmentation  due  to  the  greenish  color  they  happen  to  share.  Also  notice  that  if  we  add 
more  patches  we  will  have  more  softly  segmented  patterns. 


Where  a  is  a  weight  to  model  the  ‘visual  sensitivity’  to  intensity  roughness.  Notice  that,  in  contrast  to 
original  Mumford-Shah  model,  the  energy  of  each  channel  is  defined  in  all  Q  instead  of  each  IV  This  is  a 
good  thing  for  the  mathematical  analysis  because  we  are  working  in  a  fixed  domain. 


Now,  for  the  term  J[I\P,U]  we  will  study  a  particular  case  (Gaussian  mixture  with  smooth  mean  fields). 
Assuming  that  all  patterns  are  gaussian  with  means  u\, . . . ,  uk ■  WLOG  we  can  assume  that  they  all  have 
the  same  variance  cr2,  then,  for  each  pixel  ie!l 

(I\oj(x)  =  i)  ~  N(ui(x),  a2)  i=  1, ...,  K 


Then,  due  to  ‘Total  probabilities’  property  and  applying  the  Gibbs  Energy  we  have: 

K 


J[I\P,U}  =  J^[I\P,U}  =  -n  I  \og(f2 g(I\ui(x),a)pi(x))  p>0 

i=l 


A  ''WAr 


Where  g  denotes  the  Normal  density  function  given  by:  g(I\m,a)  =  exp 
This  formula  is  complicated  in  general,  so  the  authors  made  the  following  reduction: 
Considering  each  ownership  Pi(x)  ~  so  we  have: 


K 


K 


-!°g  XA(/lMi(x)’CT)pi(x)J  ~  -1°S  ( 'Y^9{I\ui{x),a)lni{x) 

\i=l  /  \*=1  / 

K  K 

= -y]logg(/|Mi(a;),cr)lnAa;)  ~  -^2^°gg{I\'Ui{x),a)pi(x) 


K 


=  -  Y (I  —  Ui)2pi{x)  +  Constant(K ,  a) 

2a- 
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This  suggest  the  following  model: 


K 

J[I\P ,  U]  =  A  /  £(J  -  i^a))2!^*)  =  ^ 

"'n  *=i 


SMS 

sim 


Finally,  the  modelling  of  the  term  J[P ]  (i.e.  the  energy  model  for  the  ownerships)  will  be  done  considering 
the  following  conditions: 

•  Each  pattern  ownership  Pi(x)  have  at  most  two  phases:  ‘inside’  ( p^  =  1)  and  ‘outside’  (pi  =  0),  the 
transition  between  them  is  narrow 

•  The  boundaries  (transition  bands)  are  regular,  not  in  zig-zag 

With  this  considerations  we  will  use  the  following  energy,  due  to  Modica  and  Mortola  [18]  with  double 
potential:  Given  pi  £ 

Js\pi]=  [  9e\VPi\2+  (P<(1~P<))2  =F^S  ie{l,...,K}  6  «  1 

Notice  that  e  controls  the  transition  band,  and  by  this,  we  need  also  (to  minimize  the  energy)  that  p,;  takes 
only  0  or  1  values  (which  is  our  first  condition).  Also,  due  to  smallness  of  e  we  need  regular  pt  (if  not,  the 
gradient  part  explodes),  which  is  the  second  condition. 

This  energy  is  widely  studied,  in  particular  in  the  context  of  F-convergence  [17],  the  main  results  for  that 
studies  made  a  strong  link  between  SMS  and  classical  Mumford-Shah  functional,  as  we  will  see  in  the  following 
theorems:  Before  showing  the  theorems  we  will  need  a  few  definitions: 

Let  q  £  L1,  we  will  call  the  Total  Variation  (TV)  of  q  (as  a  Radon  measure)  to  the  following  formula: 


TV[q}:=  j  |V<?|  =  sup  (q,V  ■  <p) 

J  n  tp£C'(n,B2) 


We  also  define,  for  q  £  L1: 


ifg  =  0Vg  =  l,  a.e  in  H 
if  not 


Notice  that  by  this  definition  we  have:  if  Jo[<z]  =  +oo  then  q  has  only  two  phases,  then  Jo[q]  =  TV[q]  = 
Per(q~1(  1))  where  Per(q~1(  1))  denotes  the  perimeter  of  the  support  of  q 
Finally,  consider  the  following  subspace  of  T1(n): 


4,i]  W  :=  {*?  e  L1^)  |  q( x)  £  [0, 1]  \/x  £  11} 

After  this  definitions  we  can  enounce  the  big  theorems  about  the  Modica  and  Mortola  model: 

Theorem  C.5.1. 

\/q  £  Ll 0  1,  \  U1(ll)  let  Je[q\  =  +oo.  Then: 

Je  — >  Jq  in  the  sense  of  the  T-convergence  in  L^0  1](H) 

Theorem  C.5.2. 

Suppose  we  choose  a  subsequence  pe  in  an  optimal  way  (i.e.  they  converge  optimally  to  a  two-phase  pattern 
lv’(aj)  with  boundary  F  =  dV).  Then 


Je[Pe }  length(T)  =  f  \Dlv(x)\ 

Jn 

This  last  theorem  reveals  the  strong  connection  with  the  SMS  model  with  the  original  Mumford-Shah 
model. 
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C.5.2  Problems  and  Solutions:  Symmetry  and  ‘Weak  Supervision’ 

Let  Sk  the  permutation  group  of  {1  Given  a  A'-tuple  F  =  (fi, . . . ,  fx)  we  define:  Fa  := 

(/< r(i),  •  ■  • ,  fa(K))  where  er  g  SK- 

We  have  the  following  ‘problematic’  result,  which  leads  to  lose  uniqueness  (and  even  worse:  using  iterative 
algorithms  we  can  have  new  intermediate  solutions) 

Theorem  C.5.3.  [Hidden  Symmetry  of  SMS] 

Given  a  GSK  we  have:  JSMS[Pa,  Ua\I]  =  JSMS[P,  U\I] 

In  particular,  if  we  have:  ( P*,U *)  =  argmin(ptu)£admK  jSMS  U\I]  =>  Vct  £  Sk  {P„  ,  U*)  is  also  an  optimal  point 
The  proof  of  this  result  is  straightforward. 

The  obvious  question  is  how  to  eliminate  this  symmetry,  which  lead  us  to  lose  uniqueness,  the  following 
technique,  called  ‘weak  supervision’,  lead  us  to  ‘break  the  symmetry’: 

We  define  K  patches:  Q 1, . . . ,  Qk  and  then  we  impose  the  following  restriction: 

Pi\Qj  =  h3  e  {b 


where  Sij  is  the  Kronecker  delta. 


C.5.3  Existence  Results 


Theorem  C.5.4.  [Unsupervised  SMS] 

Suppose  that  I  £  L2(tt).  Then  given  any  positive  modeling  parameters  (A,  a,e),  a  minimizer  to  the  unsuper¬ 
vised  soft  Mumford-Shah  (SMS)  model  must  exist. 

Theorem  C.5.5.  [Supervised  SMS] 

Suppose  that  I  £  L2(Q).  Then  given  any  positive  modeling  parameters  (A,  a,  e),  a  minimizer  to  the  supervised 
soft  Mumford-Shah  (SMS)  model  must  exist,  assuming  that  each  patch  Qi  has  a  positive  Lebesgue  measure. 

The  proof  of  this  theorems  can  be  found  on  the  original  paper  of  Shen,  anyway  the  proof  is  relatively 
standard  except  for  one  technical  lemma. 


C.5.4  Details  on  the  deduction  of  Euler  Lagrange  equations  for  the  SMS  Energy 

The  first  calculate  the  first-order  partial  variation  on  U  given  P  is  standard. 

In  the  case  of  the  first-order  partial  variation  on  P  given  U  we  have  to  be  more  careful,  because  in  this  case 
we  have  the  constraint  P  :  SI  — >  A/v  i,  so,  the  derivative  must  take  values  on  the  tangent  subspace  of  Ax-i- 
To  do  this  we  will  follow  the  technique  developed  by  Chan  and  Shen  [3]: 

We  first  compute  the  first-orden  variation  in  P,  this  give  us  to: 


6Jsms  =  [  V  ■  SPdx  +  [  v  ■  SPdS 
Jn  Jon 

where  V  =  (Vi, . . . ,  Vk)  and  v  =  (v±, . . . ,  Vk)  and  V)  =  A (tq  —  I)2  —  18£Ap,;  +  2pi(  1  —  p[)(  1  —  2 p[)  defined  in 
and  Vi  =  18e|jEi  defined  in  <9f 1.  So,  the  first  partial  variation  without  the  constraint  in  P  is: 

dJ  t  r  I 

^-p  =  V|q  +  u|an 

Now,  because  P  £  Ak-i  we  have  to  project  appropriately,  for  this:  Let  TpAx-i  the  tangent  space  of  A^-i 
in  P  £  A^-_i,  and  let  7r  :  TpRK  — >  TpAx-i  the  ortogonal  projection  in  the  tangent  space.  Notice  that,  in 
this  case,  the  normal  direction  of  the  tangent  plane  is  given  by: 


Ik  _  (1,  ■  •  ■ ,  1) 

s/K  ~  \[K 


Then,  the  projection  operator  is  given  by: 


n(w)  =  w  — 


1  k(w,  1  k) 
K 


=  w  —  (w)1k 
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where  (w)  =  'K  Yh=i  wi- 

Finally,  the  restricted  gradient  of  E  in  Ax_i  is: 


dJSMS  fdJSMS\ 

~dp~~7r\~djp~) 


( V  -  (u)ljf)|n  -  (v-  (u)  lx)  I  an 


In  particular,  this  leads  to  the  Euler  Lagrange  equations  on  P :  For  i  £  {1, . . . ,  K } 


Vi{x)  —  (V)(x)  =  0  in  Q 

Vi(x)  —  ( v)(z )  =  0  in  dLl 


The  following  lemma  allows  us  to  take  an  homogeneous  Neumann  conditions: 


Lemma  C.5.6. 

Let  P  :  fl  — >  Ax-i-  Then  \/x  £  dfl  we  have:  (v)(x)  =  0 


Proof: 


The  second  equality  is  because  x  £  dfi 


18e  d 
K  dn 


18e  d 
K  dn 


(1)  =  0 
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D.  Nonlocal  Image  and  Movie  Denoising  -  A.  Buades,  B. 
Coll  and  J.M.  Morel  [1] 

D.l  Introduction:  Neighborhood  Filters  and  NL-means 

We  will  say  that  a  filter  (for  images  or  video)  is  a  neighborhood/NBH  filter  if  this  reduces  the  noise  by 
averaging  similar  pixels,  (anyway,  we  can  use  another  statistical  estimates  like  the  median). 

General  CCD  noise  models  (as  we  will  see  later)  are  signal  dependent.  Fortunately  two  pixels  which  received 
the  same  energy  from  the  outdoor  scene  undergo  the  same  kind  of  perturbations  and  therefore  have  the  same 
noise  model. 

We  will  accept  the  following  general  assumption  (which  is  the  basic  idea  where  this  models  relies): 
Assumption:  At  each  energy  level  the  noise  model  is  additive  and  white,  then  denoising  can  be  achieved  by 
first  finding  out  the  pixels  which  received  the  same  original  energy  and  then  averaging  their  observed  grey 
levels. 

Since  the  original  values  of  the  image  are  lost  the  filters  proceed  by  picking  for  each  pixel  i  the  set  of 
pixels  Mi  spacially  close  to  i  and  with  similar  grey  value.  The  NBH  filters  proceed  by  replacing  the  grey  level 
of  i  (which  will  be  denoted  u{i))  by  the  average  NF(u(i))  =  J2jeA f-  u(i)  (again,  we  can  use  median  also) 
Under  the  assumption  that  the  pixels  of  Mi  have  the  same  energy  as  i,  we  have  that  NF(u(i))  is  a  denoised 
version  of  u(i). 

Most  popular  NBH  filters  are:  a— filter  (Lee  1983),  SUSAN  (Smith,  Brady  1997)  and  the  bilateral  filter 
(Tomasi  and  Manduchi  1998)  where  the  neighborhoods  are  Gaussian  in  space  and  grey  level. 

In  contrast,  the  Non-local  means  (proposed  by  Buades  et  al.  in  2005)  filter  is  based  in  the  following  idea: 
Extend  the  concept  of  neighborhood  to  a  wide  class  which  they  called  non-local  mans  (NL-means).  This 
algorithms  class  defines  the  neighborhood  Mi  of  i  under  the  following  condition: 

j  £  Mi  the  grey  level  of  a  whole  window  around  jis  close  to  the  grey  level  of  the  window  around  i 

In  simple  words,  we  are  relaxing  the  spatial  constraint  of  the  classical  neighborhood  filters. 

D.1.1  Comparison  Principles 

A  systematic  comparison  of  the  huge  variety  of  denoising  methods  is  requested.  The  authors  consider 
that  visual  comparison  of  artificially  noisy  images  with  their  denoised  version  is  subjective  (which  is  a  usual 
technique  in  papers),  moreover  this  comparison  methods  depend  strongly  on  the  choice  of  the  image  and  do 
not  permit  to  address  the  main  issues:  the  loss  of  image  structure  in  noise  and  the  creation  of  artifacts. 

The  authors  propose  three  comparison  principles  aiming  at  more  objective  benchmarks. 

The  first  principle  asks  that  noise  and  only  noise  be  removed  from  an  image.  It  has  to  be  perceptually 
tested  directly  on  an  image  with  no  artificial  noise  added,  then  the  idea  is  compare  the  difference  between 
the  image  and  its  denoised  version  for  each  method.  We  will  call  this  difference  ‘method  noise’.  With  this 
comparison  method  is  much  easier  to  evaluate  whether  a  method  noise  contains  some  structure  removed  from 
the  image  or  not. 

The  second  principle,  which  we  will  call  ‘noise  to  noise’  relies  on  the  idea  that  a  denoising  algorithm  trans¬ 
forms  a  white  noise  into  a  white  noise.  This  may  be  seen  as  a  paradoxical  requirement,  but  it  is  a  good  way  to 
characterize  artifact-free  algorithms.  Also,  we  have  a  powerful  mathematical  tools  for  testing:  Mathematical 
analysis  and  Fourier  spectrum  testing. 

The  third  principle,  which  we  will  call  ‘statistical  optimality’  is  restricted  to  neighborhood  filters,  and  is 
based  to  question  if  a  given  NBH  filter  is  able  or  not  to  retrieve  faithfully  the  neighborhood  Mi  of  any  pixel 

i. 

As  we  will  see  later,  the  NL-means  is  the  one  with  the  best  results  on  this  principles. 
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D.2  Noise  Model 


In  this  section  we  study  barely  a  classic  model  for  noise,  this  is  the  model  for  the  CCD  devices  and  the 
main  result  of  this  model  is  an  hypothesis  that  will  made  the  NBH  filters  useful  for  denoising. 


In  CCD  devices,  we  have  three  kinds  of  noise: 

1.  Shot  Noise:  This  noise  is  proportional  to  the  square  root  of  the  number  of  incoming  photons  in  the 
captors  during  the  exposure  time,  namely: 


n0  =  \  ■  A  ■  ?y  =  CV$ 

V  hu 

where  <I>  is  the  light  power,  hv  the  photon  energy,  t  the  exposure  time,  A  the  pixel  area,  and  ?y  the 
quantum  efficiency.  Joining  all  constants  in  C  we  have  the  last  formula.  Where  $  can  be  understood 
as  the  true  image. 

2.  Dark  or  Obscurity  Noise:  We  will  denote  this  noise  as  ni,  is  due  to  spurious  photons  produced  by  the 
captor  itself.  We  can  assume  the  dark  noise  to  be  white,  additive  and  with  zero  mean. 

3.  Read  out  Noise:  We  will  denote  this  noise  as  n2,  is  another  electronic  additive  and  signal  independent 
noise.  Can  be  assumed  to  have  zero  mean. 


Also,  we  have  to  consider  another  correction,  a  “gamma”  correction:  a  nonlinear  increasing  contrast  change, 
we  will  denote  it  as  a  function  /  applied  to  the  noisy  image.  It  is  applied  as  an  internal  adjustment  in 
rendering  of  images  through  photography,  television  and  computer  imaging.  Usually  we  take:  /( x)  =  xa 
with  a  €  (0, 1) 

Summarizing  we  have: 

u(i)  =  /($(*)  +  c\/  $(i)  +  ni(i)  +  n2(i )) 

When  <h(z)  is  large  the  shot  noise  dominates  n\  and  the  signal  $(z)  dominates  n2,  this  let  us  to  expand 

u(i): 

u{i)  ~  /(<f>(i))  +  /'($(«))(CV$(*)  +  M*)  +  n2{i))  =:  /($(*))  +  n(i) 

If  instead  <f>(z)  is  small  with  respect  to  n\{i)  +  n2(z): 

n(i)  ~  u(i)  ~  +  n2(i)) 

in  the  particular  (and  interesting  case)  of  a  =  1/2  we  have: 


n{i) 


n0(i) 

\J  ni  (i)  +  n2(i) 


bright  parts  of  the  image 
dark  parts  of  the  image 


In  all  cases  the  noise  is  signal  dependent  but  independent  at  different  pixels. 


In  the  following  we  aim  at  recovering  /($(z)),  ie.  the  true  image  up  to  the  unknown  gamma  correction. 
The  approximations  we  made  for  u(i )  and  the  white  noise  and  independence  assumptions  on  no,ni,n2  legit¬ 
imate  the  following  important  hypothesis: 

Hyphotesis:  In  a  digital  image,  the  noise  model  at  each  pixel  i  only  depends  on  the  original  pixel  value  <I>(i) 
and  is  additive.  Let  A/j  the  set  of  pixels  with  the  same  original  value  as  i.  Then  £  A /j  are  independent 

and  identically  distributed,  (i.i.d.) 

This  hypothesis  leads  us  to  give  a  “proof”  of  the  correctness  of  NBH  (and  NI.  means;  algorithms: 

Given  a  pixel  i ,  let  j  £  Af{  all  the  pixels  that  follow  the  same  model  of  i,  i.e.  Vj  £  Afi  :  u(j)  =  v(i)  +  n(j), 
where  v(i)  is  a  deterministic  function,  n(j)  are  i.i.d.  noise.  Then,  considering  the  denoising  filter: 

NF(u(i))  :=  tttj  J2  u(j) 

|7Vj  |  jeW 
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Thanks  to  the  hypothesis  and  variance  formula  for  independent  variables  we  have: 

NF(u(i))  =  v{i)  +  h(i) 

where: 


Var(n(i))  =  y— —Var(n(i))  <  Var(n(i)) 

|A/i| 


i.e.  these  filters  reduce  the  variance  of  the  residual  noise. 


D.3  General  Neighborhood  Filters 

D.3.1  Local  NBH  Filters 


We  will  present  these  filters  in  order  or  complexity: 

The  first  one,  and  then  the  most  “primitive”,  is  based  in  replacing  the  color  of  a  pixel  with  an  average  of  the 
nearby  pixels  colors,  i.e.  Mi  is  just  a  spatial  neighborhood.  The  filtered  value  for  the  pixel  x  is: 

1  f  _  l*-vl2 

Mp(u(x))  =  — K  e  P2  u(y)dy 

7T  P  J  R2 

Where  p  is  a  parameter  that  controls  (roughly)  the  size  of  the  neighborhood.  The  problem  of  this  filter  relies 
on  the  case  when  a  spatially  closer  pixels  of  the  pixel  i  don’t  have  similar  colors  as  i.  When  the  color  is 
replaced  by  an  average  of  very  distinct  colors  it  produces  blurring  in  the  border  of  the  transition  of  colors. 
This  suggest  the  needness  of  a  model  which  includes  a  weight  to  discard  closer  but  ‘too  much’  different  pixels 
for  the  averaging,  this  is  the  idea  for  the  Sigma  filter  (Lee  1983,  Yaroslavsky  1985):  Average  neighboring 
pixels  which  also  have  a  similar  color  value,  the  filtered  value  is  given  by: 

NFh,p(u(x))  =  -^-r  /  e  ^  u{y)dy 

JBp(x) 


Only  pixels  inside  Bp(x )  are  averaged,  h  controls  the  color  similarity  (is,  roughly  speaking,  the  tolerance  for 
the  color  similarity)  and  C(x)  a  is  normalization  factor. 

Later,  to  avoid  dependence  of  a  “Ball  Neighborhoods”  we  have  the  filters  SUSAN  (Smith  and  Brady  1997)  and 
bilateral  (Tomasi  and  Manduchi  1998)  where  the  ball  neighborhoods  are  replaced  by  exponential  penalization 
on  space,  ie.  we  have  “bilateral”  Gaussian  depending  on  both  space  and  grey  level: 


SNFhtP(u(x)) 


1 

C{x) 


_  \*~y\2  \u(x)  —  u(y)\2 

e  p  ■  e  fc3  u(y)dy 


Another  way  to  avoid  the  blurring  of  the  spatial  filtering  A4P  is  by  a  statistical  correction  due  to  Lee  in  1980: 


LMp{u{x)) 


Mp(u(x))  + 


(u{x)  -  Mp(u(x))) 


where 

=  max  (0,  — ry  [  e~  — — ^-(u(y)  -  Mp(u(x)))2dy  -  a2 J 

V  npz  JR 2  p  J 

The  idea  of  this  correction  is  based  on  the  following  observation:  When  the  Gaussian  mean  is  performed  on  an 
edge,  the  variance  of  the  performed  mean  can  become  larger  than  the  variance  of  the  noise,  this  phenomena 
is  not  desired,  and  the  correction  tries  to  avoid  this. 

Bilateral  filters  anyway  have  a  better  performance  than  Lee’s  correction. 

Notice  that  we  can  replace  the  mean  operation  by  nonlinear  operator  like  the  median  filter.  Tukey  in  1997 
gives  a  median  filter,  this  chooses  the  median  value,  that  is,  the  value  which  has  exactly  the  same  number  of 
grey  level  values  above  and  below  in  a  fixed  neighborhood.  It  is  equivalent  to  an  average  of  the  pixels  in  a 
direction  orthogonal  to  the  gradient,  that  is  to  an  anisotropic  diffusion  or  mean  curvature  motion. 

A  small  comparison  of  this  neighborhood  filters  that  can  be  seen  on  the  original  paper  gives  us  non  fully 
acceptable  results:  Gaussian  filtering  don’t  maintain  sharp  edges,  anisotropic  filter  removes  small  details  and 
fine  structures,  Lee’s  statistical  filter  leave  some  areas  untouched  (then  noisy),  sigma  and  bilateral  creates 
irregularities  on  the  edges.  This  comparison  make  the  needness  of  consider  a  new  model. 
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D.3.2  Non  Local  Averaging 

As  we  said  before,  the  main  idea  of  this  model  is  based  on  the  simply  observation  that  the  most  similar 
pixels  to  a  given  pixel  have  no  reason  to  be  close  of  it  (for  example  in  periodic  patterns),  then  the  idea  is  to 
construct  a  filter  which  consider  pixels  with  neighborhoods  with  similar  average  values  as  the  neighborhood 
of  the  original  pixel  (then,  we  don’t  have  spatial  constraint).  Then,  the  proposed  formula  is: 

, r r  /  ,  u  1  /'  f  gp  *  \u(x  +  •)  -  u(y  +  -)|2(0)  \  ,  w 

nl = ~c{x)  {  #  ry)dy 

where  gp  is  a  Gaussian  kernel  with  standard  deviation  p ,  C(x)  is  the  normalizing  factor,  h  acts  as  a  filtering 
parameter  and: 

gp*\u(x  +  -)-u(y  +  -)\2{0)  =  /  gp(t)\u(x  +  t)  -  u{y  +  t)\2dt 

Jr2 

This  last  formula  reveals  the  most  important  characteristic  of  this  filter,  NL  replace  the  value  of  u(x)  by  a 
weighted  mean  of  u(y).  The  weights  is  relevant  only  if  a  Gaussian  window  around  y  is  similar  to  the  same 
window  around  x.  This  is  the  concept  of  self-similarity. 

NL-means  works  great  with  text  images,  but  is  limited  when  an  image  have  structured  noise  (like  JPEG 
compression),  in  that  case  NL  lose  details.  More  specific  information  can  be  seen  on  the  original  paper. 

D.4  Principles  for  Denoising  Algorithms  Evaluation 

We  will  enounce  the  formal  assertions  for  this  principles  that  we  mentioned  before  in  the  introduction. 

D.4.1  Method  Noise 

As  we  said  before,  the  idea  of  this  principle  is  to  evaluate  if  an  algorithm  just  removes  noise,  or  it  also 
removes  some  structure  of  the  image. 

Definition:  Let  u  be  a  (not  necessarily  noisy)  image  and  D/,  a  denoising  operator  depending  on  h.  The 
method  noise  of  u  is  the  image  difference: 


n(Dh,u)  =  u  -  Dh(u) 


and  the  formal  principle  will  be: 

Principle  1:  For  every  denoising  algorithm,  the  method  noise  must  be  zero  if  the  image  contains  no  noise, 
and  should  be  in  general  an  image  of  independent  zero-mean  random  variables. 

Examples  of  the  evaluation  of  algorithms  under  this  principle  can  be  seen  on  the  original  paper,  anyway, 
roughly  speaking,  the  NL-means  have  the  best  results  for  this  principle. 

D.4. 2  Noise  to  Noise  Principle 

As  we  said  before,  the  idea  of  this  principle  is;  accepting  that  no  algorithm  can  remove  all  the  noise 
from  an  image,  at  least  we  want  to  transform  noise  in  noise  with  less  variance.  This  is  a  way  to  check  if  an 
algorithm  reduces  the  noise,  and  also  don’t  create  artifacts  on  images. 

Principle  2:  A  denoising  algorithm  must  transform  a  white  noisy  image  into  a  white  noisy  image  (with 
lower  variance). 

As  we  said  before,  this  principle  have  a  good  way  to  be  checked:  Studying  the  Fourier  transform  of  denoised 
image,  because  we  know  that  the  Fourier  Transform  of  a  white  Gaussian  noise  is  a  white  Gaussian  noise, 
so,  visualizing  the  FT  of  the  denoised  image  we  will  see  if  it  remain  as  a  white  Gaussian  noise,  or  it  have 
changed  in  wrong  way  (creating  artifacts). 

Several  algorithms  have  been  checked  with  this  principle  (that  can  be  seen  on  the  original  paper),  and  bilateral 
filters  and  NL-means  report  the  best  results. 

D.4. 3  Statistical  Optimality 

We  will  understand  statistical  optimality  as  the  ability  of  a  generalized  neighborhood  filter  to  find  the 
right  set  of  pixels  Mi  for  performing  the  average  yielding  the  new  estimate  for  u(i),  obviously  this  principle 
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Figure  D.l:  Comparison  of  neighborhood  filters.  From  top  to  bottom  and  left  to  right:  noisy  image  (with 
gaussian  noise  with  a  =  15),  Gaussian  filtering,  anisotropic  filtering,  Lee’s  statistical  filter,  sigma  or  bilateral 
filter  and  the  NL-means  algorithm.  All  methods  except  the  Gaussian  filtering  maintain  sharp  edges.  However, 
the  anisotropic  filtering  removes  small  details  and  fine  structures.  These  features  are  nearly  untouched  by 
Lee’s  statistical  filter  and  therefore  completely  noisy.  The  comparison  of  noisy  grey  level  values  by  the  sigma 
or  bilateral  filter  is  not  so  robust  and  irregularities  are  created  on  the  edges.  The  NL-means  better  cleans 
the  edges  without  losing  too  many  fine  structures  and  details. 
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will  be  useful  just  for  neighborhood  (or  averaging  en  general)  methods. 

Principle  3:  A  generalized  neighborhood  filter  is  optimal  if  it  finds  for  each  pixel  i  all  and  only  the  pixels  j 
having  the  same  model  as  i.  Obviously  is  impossible  to  check  if  the  pixels  in  Mi  satisfy  $(j)  =  <f>(*),  then  in 
general  this  condition  is  relaxed  to  check  if  the  pixels  j  are  likely  to  have  the  same  value  as  i.  Examples  are 
given  in  the  original  paper  (anyway,  this  principle  is  more  useful  for  movie  denoising). 


D.5  Numerical  Examples 


Figure  D.2:  Method  noise  experiment  on  Lena  (gray  levels  only).  From  top  to  bottom  and  left  to  right: 
original  image,  Gaussian  mean,  mean  curvature  motion,  total  variation  minimization,  translation  invariant 
soft  and  hard  thresholding,  bilateral  filter  and  XI.  means 
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Figure  D.3:  Noise  to  noise  principle:  Upper  images:  Application  of  the  denoising  algorithms  to  a  noise  sample. 
From  left  to  right  and  top  to  bottom:  noise  sample,  filtered  noise  by  the  Gaussian  filtering,  total  variation 
minimization,  hard  wavelet  thresholding,  bilateral  filter  and  the  NL-means  algorithm.  The  parameters  of  each 
algorithm  have  been  tuned  in  order  to  have  a  filtered  noise  of  standard  deviation  2.5.  For  the  neighborhood 
or  bilateral  filter  the  research  zone  has  been  fixed  to  21  x  21  and  for  NL-means  we  have  used  the  whole  image. 
Therefore,  only  the  h  parameter  has  been  tuned  in  order  to  obtain  the  desired  standard  deviation  Lower 
Images:  Noise  to  Noise  principle:  Fourier  transforms  of  the  filtered  noises  displayed  in  the  upper  images. 
The  Fourier  transform  of  a  Gaussian  white  noise  is  a  Gaussian  white  noise 


DISTRIBUTION  A:  Distributing  approved  for  public  release 


E.  Nonlocal  Linear  Image  Regularization  and  Supervised 
Segmentation  -  G.  Gilboa,  S.  Osher 


DISTRIBUTION  A:  Distributing  approved  for  public  release 


Nonlocal  Linear  Image  Regularization  and  Supervised 
Segmentation  -  G.  Gilboa,  S.  Osher  [7] 


March  11,  2013 


DISTRIBUTION  A:  Distribution  approved  for  public  release 


E.l  Introduction:  Objective,  Problems  on  NL-means  and  Varia¬ 
tional  Point  of  view 

The  main  objective  of  this  paper  is  to  give  an  unified  approach  to  both  denoising  and  segmentation  tasks 
using  nonlocal  functionals  and  their  respective  nonlocal  evolutions. 

Recall  the  NL-means  filter  proposed  by  Buades,  Coll  and  Morel  in  [1] 

NL(u(x))  =  — [  e~da{u{x)'u(v))lh2 u{y)dy 
C(x)  Jn 

where 

da(u(x),u{y))  =  /  ga(t)\u(x  +  t)  -  u{y  +  t)\2dt 
Jn 

where  ga(t)  is  a  gaussian  with  variance  a. 

The  normalization  used  here  (performed  by  C{x))  does  not  guarantee  that  the  mean  value  of  the  filtered 
image  u  is  the  same  as  the  mean  value  of  the  input  image,  this  is  not  desired  (for  example  when  we  have 
Gaussian  white  noise).  Also,  normalizing  in  this  manner  introduces  some  bias  from  the  original  distance 
between  points  with  many  similar  regions  (high  C(x))  to  more  rare  and  singular  points  (low  C(x)).  Dividing 
by  C{ x)  tends  to  diminish  this  distinction.  The  authors  will  show  a  different  normaliation,  which  retains 
symmetric  similarities  between  points,  preserve  the  mean  value  and  does  not  tend  to  blur  singular  regions. 
They  believe  this  may  explain,  in  part,  why  their  proposed  iterative  process  outperforms  the  original  filter. 

Kindermann-Osher- Jones  interpreted  the  NL-means  (and  NBH  filters  in  general)  as  regularizations  based 
on  nonlocal  functionals  in  the  general  form: 

Jkoj{u)  ■  =  f  f  f  ^  w^x  _  y\)dxdy 

Particular  cases  are  the  Yaroslavsky  functional  Jyor  where  /  =  \  _  exp  or  the 

NL-means  functional  Jbcm  where  f  =  1  —  exp  fLbdDXy)) ''j 

The  idea  is  to  solve  a  minimization  problem  using  the  above  functionals  to  have  a  solution  (which  will  be  the 
filtered  image).  Notice  that  we  have  tu(|a:  —  y |),  in  this  cases  this  represents  a  simple  symmetric  window  and 
/(•)  is  the  most  important  part  of  the  regularization.  The  main  problem  is  that  in  general  this  functionals 
are  not  convex. 

The  authors  follow  this  approach,  but  simplified  to  a  quadratic  functional  by  changing  the  roles  of  /  and  w. 

E.2  Regularization  Functional 

The  authors  consider  the  following  regularization  functional: 

Freg(u)  :=  \  I  (u{ x)  -  u(y))2w(x,y)dxdy 

4  J  fix!) 

This  could  be  considered  for  fl  C  R”,  and  for  images  for  n  =  2.  The  weight  function  w(x,y)  is  positive 
(w(x,y)  <  0)  and  symmetric:  w(x,  y)  =  w(y,x). 

For  our  purposes  w  is  based  on  image  features  and  can  be  understood  as  the  proximity  between  two  points 
x  and  y ,  based  on  features  in  their  neighborhood.  Notice  the  difference  between  the  functional  given  before, 
the  role  of  w(x ,  y)  is  much  more  important  now,  it  basically  determinines  the  regularization. 

Notice  the  corresopnding  Euler-Lagrange  descent  flow  is: 

ut(x)  =  -F'reg(u)(x)  =  -  j^{u{x)  -  u(y))w(x,  y)dy  (E.l) 

Define  now  the  following  linear  operator: 

Lu(x)=  /  (u(y)  —  u(x))w(x,y)dy  (E.2) 

Jn 
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Notice  that  L  can  be  seen  as  the  continuous  version  of  the  graph  Laplacian  (where  the  undirected  graph 
G  =  (V,  E)  have  weights  in  every  arc).  We  will  see  that  L  have  similar  properties  with  the  operator  -y(c(x)V) 
where  c(x)  is  a  symmetric  matrix,  c(x )  >  0. 

Assuming  our  initial  condition  as  the  initial  input  image  /,  we  can  write  the  descent  flow  as  the  following 
linear  (non  local)  diffusion  problem: 


ut{x)  =  Lu(x),  u\t=o  =  f(x) 


(E.3) 


E.3  Variational  Denoising 

One  can  add  a  convex  fidelity  term  to  the  convex  functional  J.  For  the  L 2  fidelity  we  can  consider: 

J(u,f)  =  Freg(u)  +  Fsim(u)  =  Freg(u )  +  ~  |  |tr  /||  \  (E.4) 

then,  u  satisfies  the  Euler  Lagrange'  equation: 


—  Lu  +  A  (u  —  /)  =  0 


(E.5) 


This  can  be  seen  as  a  constrained  problem: 

u  :=  argminFreg(u),  s.t.  \\u  -  f\\l  =  \fl\al 

where  o2  is  the  variance  of  an  additive  noise  in  a  noisy  image  /.  Then  A  can  be  viewed  as  a  Lagrange 
multiplier  and  one  can  compute  the  constrained  problem  by  initializing  e.g.  with  it|t=0  =  /,  A  =  1  and 
iterating: 

Ut  =  Lu  +  A  (/  —  u) 

X  =  Tn f  (u  ~  f)Ludx 

MK  Jn 


E.4  Multichannel  signals 


For  the  case  when  we  have  a  multichannel  signals:  let  f(x)  =  (/\  f2, . . . ,  fM){x)  be  a  M  channel  signal. 
A  multi-valued  affinity  function  (as  we  will  see  later)  is  used  to  compute  w(x,y )  based  on  /  ( w(x,y )  will  be 
the  same  for  all  channels).  Let  u(x)  =  (it1, . . .  ,uM)(x)  be  the  regularized  signal.  The  regularizing  functional 
in  this  case  is: 


M 


rpmc 
r  reg 


(u)  :=  7  ~  uZ(x)fw(x,  y)dy 


/ox  n 


(E.6) 


and  the  multi-channel  evolution  for  each  channel  ul  is: 


u\(x)  =  f  ( u\y )  -  ul(x))w(x,y)dy,  ul |t=0  =  f 
J  o 


(E.7) 


E.5  Properties  of  L 

The  following  properties  of  the  linear  operator  L  leads  to  establish  some  results  for  the  flow  (E.3)  and  for 
the  variational  problem  (E.5) 

Proposition  E.5.1.  The  linear  operator  L  defined  by  ( E.2 )  satisfies  the  following  properties: 

1.  If  u(x)  =  const  then  Lu{x)  =  0.  For  w(x,  y)  >  0,  Vx,  y  £  Cl,  if  Lu{x)  =  0  then  u(x)  =  const 

2.  Let  u(xq )  >  u(x),  Vx  €  Q,  then  Lu(xo)  <  0.  For  the  minimum  let  u{xf)  <  u(x)  then  Lu[x\)  >  0. 

3.  —L  is  a  positive  semidefinite  operator  i.e.  (— Lu(x),  u(x))  >  0  where  we  consider  the  L2  inner  product. 

4.  JnLu(x)  =  0 


DISTRIBUTION  A:  Distributicgn  approved  for  public  release 


We  will  need  an  additional  technical  condition  on  the  weight  function.  The  idea  of  this  condition  is  that 
w(x,y)  can  have  zero  values,  but  we  need  a  certain  level  of  connectivity,  in  the  context  of  that  there  will  not 
be  any  disjoint  regions  where  no  information  is  exchange  between  them  throughout  the  evolution.  Consider: 

— Lhas  a  zero  eigenvalue  of  multiplicity  1.  (E.8) 

This  condition  is  equivalent  to  state  that  —  L  has  only  a  constant  function  in  its  null-space.  In  graphs,  this 
condition  is  equivalent  to  a  connected  graph  (in  the  sense  that  we  stated  before)  when  the  linear  operator  is 
the  graph  Laplacian.  In  our  case  the  specific  relation  is  given  by  the  following  lemma: 

Lemma  E.5.2.  Condition  (E.8)  holds  if  and  only  if  for  any  two  points  x,y  there  exists  a  sequence  of  points: 
Z\, . . . ,  Zk  such  that  w(x,  z\)  •  . . .  •  w(zk,  y)  >  0 

With  this  technical  condition  and  the  properties  of  the  linear  operator  L  we  can  prove  some  important 
properties  of  the  flow  (E.3): 

Proposition  E.5.3.  The  flow  defined  by  (E.3)  satisfies: 

1.  Preservation  of  mean  value: 

w\Lu{x't)dx=kLl(x)dx 

2.  The  extremum  principle  holds 

min  f(x)  <  u(x,t)  <  max  /(x)  Vx  €  fi,Vi  >  0 


3.  For  w(x,  y)  which  admits  condition  (E.8),  the  solution  converges  to  a  constant 

u(x,t  — ►  oo )  =  const  =  t-=-7  I  f(x)dx 

l“l  J o 

4-  The  following  estimate  holds: 

\jtMx^  HI-0 

Notice  that  from  properties  (i)  and  (iv)  we  have  4zvar(u(t))  <  0,  where  var(u )  is  the  (empirical)  variance 
of  u,  defined  by: 

var(u)  :=  P|  ^  ^0*0  -  J  u(y)dySj  dx 

Similar  properties  can  be  found  for  the  variational  formulation  (E.5): 

Proposition  E.5. 4.  The  minimizer  ux  of  (E.5)  admits  the  following  properties: 

1.  Preservation  of  mean  value: 

w\LuX{x)dx=w\Lf[x)dx'  va-° 

2.  The  extremum  principle  holds 

min  f(x)  <  ux(x)  <  rna x/(x)  Vx  G  O,  VA  >  0 

X  X 

3.  For  w(x,y)  which  admits  condition  (E.8),  the  solution  converges  to  a  constant  when  A  — >  0 

lim  ux(x)  =  const  =  — —  f  f(x)dx 

w  |fi|  Jn 

4-  The  following  estimate  holds: 
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E.6  Weights  based  on  affinity  functions 

As  we  seen  before,  the  weights  w(x,y)  will  define  the  regularization  induced  by  the  functional  J(u).  The 
idea  is  to  consider  a  basic  affinity  structure  as  the  similarity  between  image  features.  For  a  point  x  G  R2 
we  will  assign  a  feature  vector  denoted  by  Ff(x )  (a  image  feature  could  be  grey  level  value,  edge  indicator, 
dominant  direction,  dominant  frequency,  etc.)  We  will  consider  also  the  region  5 lw(x)  C  51  as  the  points  y 
where  w(x,  y)  >  0.  Due  to  symmetry  of  w(x,  y)  we  will  have  the  symmetry  of  this  sets  in  the  following  sense: 
y  G  £lw(x)  x  G  51u,  (y). 

Then,  in  general  we  will  have  the  general  weight  function  based  on  affinities: 

w(x  y)  =  |  h(Ff(x),  Ff(y))  y  G  51w(a:) 

1  0  otherwise 

where  ft.(si,S2)  is  a  similarity  function  with  the  following  properties: 

•  Positive:  h(si,S2)  >0 

•  Symmetric:  h(s\,  S2)  =  h(s2,  Si) 

•  Bounded:  h(s\,S2)  <  M  <  00 

•  Maximal  at  equality:  fi(si,si)  >  /i(si,S2)  Vsi,S2 

Then,  in  general,  for  features  in  a  suitable  Banach  space,  with  a  norm  || 
function  is: 

/i(si,s2)  =e-(l|si"S2|l/m)P 

where  m  is  a  threshold  parameter.  The  power  p  >  1,  in  general  is  set  to  p 

E.6.1  Weights  examples 

Intensity,  local 

h(Ff(x),  Ff(y))  =  e-^F^x)~Ff(y)\/h)2 

Ff(x)  =  /O) 

ttw{x)  =  {y  G  51  |  \y  -  x\  <  Acc} 

where  Aa:  is  the  grid  size.  This  results  in  a  4  nearest  neighbors  discretization. 

Intensity,  weighted,  semi-local: 

h(Ff(x),Ff(y))  =  e-^Ff^}-Ff^/h)2e-\x-y\2/^ 

Ff(x)  =  f(x) 

51  u;  (®)  =  {y  G  51  I  \y  -  x\  <  r} 

where  ad  controls  the  spatial  decay  and  r  is  the  window  radius  (should  be  in  order  of  ad) 

For  textures,  let  . . . ,  K M (x)  be  M  linear  filters  of  different  directions  and  frequencies.  Let  vl  :=  u*Kl. 

The  weights  can  be  computed  by: 

h(Ff(x),  Ff(y))  =  e-UFf(x)-Ff(y) \/hf 
Ff(x)  =  (■ v\...,vM){x ) 

ttw(x)  =  {y  G  51  I  \y  -  x\  <  r} 

The  nonlocal  version  of  Yaroslavsky  affinity: 

h(Ff(x),  Ff(y))  =  e-UFfW-Ff(.y)\/h)2 
Ff{x)  =  f(x) 

51 w(x )  =  51 


•  ||  a  typical  choice  for  similarity 

(E.10) 

=  2  in  case  of  Euclidean  norm. 
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XI.  means  affinity: 


e-(\Ff(X)-Ff(y)\/h)2 

f(x)  £  Bx  where  Bxis  a  patch  centered  at  x 

n 


h(Ff(x),Ff(y))  = 
Ff(x)  = 
(x')  — 


E.7  Discretization 


We  will  require  that  the  weights  are  sparse  enough,  this  will  made  the  complexity  of  the  algorithm  linear. 
This  constraint  is  usually  not  very  limitating  since  in  the  general  case  if  we  have  a  large  window  many 
connections  have  very  low  weight  values  which  can  be  set  to  zero.  If  there  are  many  connections  with  high 
weight  values  we  suggest  to  take  randomly  only  a  part  of  them.  The  iterative  process  can  usually  compensate 
this  random  choice. 

Let  Uk  the  value  of  the  pixel  k  in  the  image  (1  <  k  <  N),  Wj-i  is  the  sparsely  discrete  version  of  w(x,y). 
Recall  the  neighborhood  notation:  l  £  A4  =  {l  :  w  :  kl  >  0}.  The  flow  (E.3)  is  implemented  by  the  explicit 
in  time  forward  Euler  approximation: 

u-+1=unk+At  £  wkl(u?-uV  (E.ll) 

?eA4 

where  u jj  =  Uk(nAt)  All  the  coeficients  on  the  right  side  are  nonnegative  if: 


1  >  At  ^2  Wkl 


(E.  12) 


This  is  the  well  known  CFL  restriction  on  the  time  step  At.  This  leads  to  maximum  norm  stability,  in  fact  a 
maximum  principle,  for  this  approximation  to  (E.3)  For  the  problem  with  fidelity  term,  we  have  the  following 
discretization: 

<+1  =  ul  +  At  Y,  «*«(«?  -  <)  +  AA t(fk  -  ul)  (E.13) 

i^Nk 

and  this  implies  the  analog  CFL  condition: 


(E.ll) 


E.8  Computing  weights  for  non  local  means 

The  most  important  (and  demanding  in  computational  terms)  part  of  the  method  is  the  approximation 
of  w(x,y)  into  a  sparse  discrete  version  Wki  (where  we  need  the  conditions  established  before).  The  authors 
propose  two  algorithms:  the  first,  a  semilocal  one,  is  proposed  for  denoising  purposes;  the  second,  a  fully 
nonlocal  (with  random  choices  instead  of  checking  all  the  possibilities)  is  intended  for  nonlocal  segmentation. 

E.8. 0.1  Semi-local  version 
Algorithm.  For  each  pixel  k : 

1.  1)  Compute  the  similarity  of  all  the  patches  in  the  window  (Authors  use  5x5  patch  Bx  and  11  x  11 
window  0„,).  Construct  A4  by  taking  the  m  (Authors  use  m  =  5)  most  similar  and  the  four  nearest 
neighbors  (for  conexity  condition)  of  the  pixel. 

2.  2)  Compute  the  weights  wu,  l  £  A4  using  the  desired  weight  function  and  set  to  zero  all  the  other 
connections,  (i.e.  Wki  =  0,  l  A4 

3.  3)  Set  wik  =  Wki  (symmetry  of  weight) 

For  the  CFL  condition  needed,  if  we  have  a  weight  function  bounded  by  1,  we  can  choose  At  =  2m+4 
satisfy  this  condition.  If  we  add  a  fidelity  term,  this  is  also  enough  (if  A  <  1). 

The  complexity  of  this  algorithm  is  0(N  x  Window Size  x  ( PatchSize  +  log???.)). 
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E.8.1  Fast  approximation  for  the  fully  nonlocal  version 

The  following  algorithm  is  based  on  ideas  presented  in  [15].  It  is  simpler  and  faster  but  not  accurate, 
anyway  the  results  are  better  than  the  original  fully  nonlocal  version. 

Algorithm: 

1.  1)  Compute  the  mean  and  the  standard  deviation  of  all  patches  in  the  image.  Create  a  two  dimensional 
bin  table  such  that  all  patches  in  a  bin  are  within  a  specific  range  of  mean  and  s.d.  from  each  other. 
Both  types  of  bins  are  spaced  in  h/ 2  increments. 

2.  2)  To  construct  the  set  Afk'-  For  each  pixel  k  we  consider  the  9  bins  around  it  (3x3  window  in  the  table, 
this  ensures  that  patches  which  are  very  similar  are  taken  into  account).  Pick  randomly  3 m  patches 
from  these  bins,  check  their  similarity  to  the  patch  of  pixel  k  and  take  the  most  similar  m  of  them. 
Add  to  A 4  also  the  four  nearest  neighbors  (for  connectness) 

3.  3)  Compute  Wki  as  in  the  local  algorithm. 

E.9  Conclusions  for  denoising 

The  main  conclusions  for  the  denoising  scheme  proposed  by  the  authors  (based  on  the  several  examples 
that  can  be  found  on  the  original  paper)  can  be  summarized  as  follows: 

1.  1)  A  semi-local  search  window  Qw  =  {y  £  0  /  \y  —  x\  <  r}  performs  better  than  a  fully  nonlocal  one 
(i.e.  $lw(x)  =  f I) 

2.  2)  The  steepest  descent  flow  performs  better  than  the  variational  minimization  for  the  same  regularizer 
J{u). 

3.  3)  The  proposed  flow  performs  better  than  the  original  NL-means  filter  as  well  as  several  well-known 
local  PDE  based  regularizations. 


E.10  Algorithm  for  (Supervised)  Segmentation 

The  following  algorithm  for  nonlocal  segmentation  is  based  on  method  that  are  used  in  the  field  of  clas¬ 
sification  and  machine  learning.  The  key  point  is  realize  that  we  use  the  same  flow  for  this  task,  the  only 
thing  that  change  is  the  initial  condition.  For  more  information  about  the  deduction  of  this  model  see  the 
original  paper. 

Let  /  the  input  image  and  w(x,y )  the  corresponding  weight  function.  Let  12^  be  an  initial  set  which  corre¬ 
sponds  to  the  object  to  be  segmentated,  and  let  f Iq  be  an  initial  set  which  corresponds  to  the  background. 
In  the  following  algorithm  these  regions  are  defined  by  the  users,  who  marks  them  as  an  initial  condition, 
specifying  the  object  to  be  segmentated  and  the  corresponding  background. 


Algorithm 

1.  1)  Initialize 


(  1  X  £  fi® 

Uo  :=  <  — 1  x  £  Qq 

( 0  otherwise 


2.  2)  Evolve  for  a  duration  T  the  flow 

ut  =  Lu ,  u|t=o  =  u0  (E.15) 

3.  3)  Define  fl°,  the  set  of  nodes  approximating  the  Object,  by: 

Q°  :=  {x  £  fl  :  u(x,  T )  >  0} 
and  the  Background  by:  =  D  \  fi° 
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In  the  case  of  multiple  objects  the  algorithm  can  be  generalized  in  the  following  way:  Let  M 

disjoint  sets  which  are  parts  of  M  regions  to  be  segmented  (including  the  background),  this  is  the  data  defined 
by  the  user,  then  the  algorithm  is: 
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Algorithm 

1.  1)  Initialize  a  M  channel  signal  u‘,  i  =  1 , ,M  as  follows: 


1  x  £  tt  l0 
0  otherwise 


2.  2)  Evolve  for  a  duration  T  the  flow 

u\  =  Lu\  td  |t— o  =  Uq 

3.  3)  Define  the  set  of  nodes  approximating  the  Object  i,  by: 

Cl1  :=  {x  £  Cl  :  i  =  argma xh^i,!)} 


(E.16) 


As  a  few  remarks: 

1.  This  algorithms  can  be  related  to  image  coloring,  where  a  grey-scale  image  can  be  colored  by  user- 
provided  coloring  examples 

2.  A  typical  example  where  this  algorithm  fail  is  when  the  user  marks  are  not  even  (geometrically  speaking) 
and  this  marks  dont  sorround  the  object.  An  example  of  this  can  be  found  on  the  original  paper. 
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E.ll 


Numerical  Examples 


9  f 


NL-means  (original)  NL-SS  (nonlocal) 


Figure  E.l:  Top  row:  Test  images.  Clean  image  g  (left),  noisy  image  /.  Cameraman  image  filtering  result 
u.  Second  row:  NL-means  (nonlocal),  and  filtering  by  nonlocal  scale-space  (proposed  flow).  Third  row: 
NL-means  (11  x  11  window),  proposed  nonlocal  scale  space  (11  x  11  window). 
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Input  image 


u.  Iter  1000 


Object 


User  marks:  O  (light),  B  (dark) 


Figure  E.2:  Horse  silhouette,  with  spots  in  both  object  and  background,  white  Gaussian  noise  is  added. 
4-neighbor  scheme.  In  the  second  and  third  rows,  the  advancement  of  the  information  with  the  iterations  is 
illustrated 


Input  image 

Object 


User  marks:  O  (light),  B  (dark) 


Figure  E.3:  Failure  example:  Marks  of  the  background  are  too  sparse  and  uneven. 
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F.  Generalised  Nonlocal  Image  Smoothing  -  L.  Pizarro, 
P.  Mrazek,  S.  Didas,  S.  Grewenig,  J.  Weickert  [12] 


The  main  idea  of  this  paper  is  to  provide  a  discrete  variational  approach  for  image  smoothing  based  on 
non  local  similarity  and  regularization  constraints  which  penalize  general  dissimilarity  measures  defined  in 
image  patches. 

The  classical  similarity  measure  is  the  weighted  L 2  distance  between  patches,  as  we  seen  before  in  NL-means. 
For  this  purpose  the  authors  first  introduce  the  NDS  model  defined  by  Mrazek,  which  generalizes  other 
models  in  the  sense  that  with  a  choice  of  parameters  this  model  reduces  to  known  others. 

After  that,  the  authors  propose  a  modification  of  the  NDS  model,  the  Generalised  NDS  model  or  GNDS, 
which  includes  the  patch  similarity  ideas  that  can  be  seen  on  the  NL-means  filter  which  leads  to  more 
versatility  and  robustness  for  local  structure  of  the  image. 


F.l  NDS  model:  Nonlocal  Data  and  Smoothness 

Let  f,u:  D  — >■  R  scalar  images  defined  in  the  discrete  domain  Q,  /  must  be  understood  as  the  original 
(noisy)  image  while  u  represents  the  “processed”  version  of  /.  Also,  let  J  =  {1, . . . ,N}  the  index  set  of  all 
pixels  in  the  images.  We  denote  the  pixel  i  as  Xj. 

The  discrete  Energy  JNDS  of  the  NDS  filter  proposed  by  Mrazek  in  2006  is  a  convex  combination  of  a 
nonlocal  data  (or  similarity)  term  FSim  and  a  nonlocal  smoothness  (or  regularization)  term  Freg ,  which  are 
given  by: 

Fsimiu )  ■  =  ^  ^(K  -  fj\2)ws(\Xi  -  Xj\ 2)  (F.l) 

i,j£j 

Fregi^)  :=  ^  ,  Tr  ( |  Uj  Uj\  )wr(\Xi  Xj  |  )  (F.2) 

i,jeJ 

Where  T.  :  Rq"  — »•  Rq"  are  increasing  functions  which  penalizes  large  tonal  distances  (in  greyscale). 

The  weights  w.  :  Rq"  — >  Rq"  are  non  negative  functions  which  penalizes  large  distances  between  pixels. 

Then,  the  complete  NDS  model  could  be  understood  as  a  discrete  nonlocal  variational  method  which  combine 
data  and  smoothness: 

JNDS(u)  =  (1  -  a)Fsim{u)  +  aFreg(u)  (F.3) 

where  a  £  [0, 1]  is  a  regularization  parameter. 


F.1.1  Critical  Points  and  Numerical  Implementation 


Is  interesting  to  study  the  critical  points  of  this  functional,  this  leads  to  a  fixed  point  scheme  for  minimizing 
the  energy  functional  and,  under  suitable  hypothesis,  this  leads  to  a  stable  procedure. 

First  of  all,  we  need  to  calculate  the  derivatives  of  main  terms: 


dFm 


duk 

BF 

reg 

duk 


=  -  fj\2)(Uk  -  fj)ws(\xk  -  Xj\2) 


=  4^2%(\uk  ~  Uj^WrQxk  -  Xj\2) 

j£J 


Notice  that  derivatives  on  T  are  respect  of  its  arguments,  then  by  linearlity: 


BJNDS  ,BF< 


dFr, 


duk 

gjNDB 


duk 

=  0  Vfc  G  J. 


Then,  for  a  critical  point  we  want:  \/  JNDS (u)  =  0  <t=>  — ^ 

If  we  call:  Sij  :=  \Ps(|ui  -  fj \2)ws(\xi  -  Xj\2) 

rig  :=  2'&'r(\ui  —  Uj\2)wr(\xi  —  Xj  |2),  then  the  critical  point  condition  becomes  to: 

0  =  (1  -  «)  s^(ui  ~  fj)  +  aYl  riAui  -  ui) 

jeJ  jeJ 


(F.4) 

(F.5) 


(F.6) 
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which  can  be  rewritten  in  the  fixed  point  form: 


/-i  \  i  V'' 

(1  -  «)  Eje  j  Si,j  +  a  Lie  j  nj 

To  have  a  well  defined  expression  we  will  need  that  T.  are  increasing  functions,  also  we  need  a  positive  weight 
functions  (i.e.  w.(s2)  >  0).  Then,  from  the  last  expression,  the  authors  propose  the  following  fixed  point 
scheme: 


<  :=  U 


u k+1  = 


(t  -  a)  ]rjeJ  stjfr + a  EjeJ 
(!  -  «)  E jej  si,i  +  Q'  E jeJ  ri,j 


(F.8) 


or,  in  a  compact  form:  uk+1  =  F(uk)  where  F  :  — »  R^  is  a  vector  function  defined  on  each  coordinate 

by  the  right  side  expression  of  (F.7).  This  is  usually  called  as  “Nonlinear  Jacobi  Method” 

This  fixed  point  scheme  has  two  simple  but  important  properties: 


Proposition  F.1.1.  Under  the  hypothesis  on  \1/.  and  w.  as  above,  the  scheme  (F.7)  satisfies  a  maximum- 
minimum  principle: 

min/,  <  uk  <  max/,  Vi  €  J,  k  €  N  (F.9) 

jeJ  ^  j&J  K 

This  proposition  leads  to  an  important  theorical  result,  which  guarantees  the  existence  of  minimizers  (due 
to  their  behaviour  of  critical  points  as  a  fixed  points  of  the  operator  F) 

Proposition  F.1.2.  The  fixed  point  equation  (F.7)  has  a  solution. 

The  proof  relies  on  Brouwer  fixed  point  theorem  and  continuity  of  the  operator  F  due  to  the  first  propo¬ 
sition. 


Alternatively,  the  minimization  of  the  NDS  energy  can  be  found  using  a  gradient  descent  optimization, 
i.e.  considering: 


uk+ 1  -  Uk  dJNDS 


\/i  G  J 


t  duk 

where  r  >  0  is  the  step  size.  In  the  same  way  as  the  first  iterative  scheme,  for  this  we  have: 


u°i  ■■=  fi  uk+1  =  (1  -  r)uk  +  r 


(!  ~  oQ  Eje  J  a*j/j  +  Q  Ej6  J 
(!  ^  a)  YOjej  skj  +  a  J2jeJ  rfj 


(F.10) 


Notice  that  if  we  set  r  =  1  we  obtain  the  fixed  point  iteration  (F.8) 


F.1.2  Important  Cases 

Let  us  rewrite  the  equation  (F.7)  in  a  more  specific  way  (in  the  sense  as  we  can  see  the  dependances  on 
parameters  like  the  window  size  for  w  (which  we  will  denote  by  w),  for  example) 

Then,  consider  the  following  notation: 


hh-KQ  ui-ftf) 


hrid~  2%(\ui-uj\2) 


s.w 

Wi,j  :=ws 


\Xj  —  x. 


|2v 


r.w 

Wfj  ■ =Wr 


\Xj  —  X- 


|2v 


notice  that  a  small  w  denotes  a  local  operation  (or  a  smaller  window),  and  a  larger  one  denotes  a  large-scale 
effects.  The  window  size  for  the  data  and  smoothness  may  differ  (because  the  functions  are  not  necesarily 
the  same),  with  this  notation  (F.7)  becomes  to: 


(!  -  a)  J2jej  hijwi’,j 


'fi+BXjtjhWJUj 


(!  -  «)  Eje  j  KjwiJ 


+  oc  E 


hr  nnr'w 

jeJ  i,3 


(F.ll) 


From  this  equation,  and  choosing  apropiate  configuration  of  w,  a  and  functions  involved  we  can  reduce  this 
model  to  a  variety  of  classical  models,  the  following  picture  shows  the  landscape  covered  by  the  NDS  model, 
we  refer  to  the  original  paper  for  more  detailed  deduction  for  this  reductions. 
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M-estimators  /  histogram  operations 


I 

U 


£ 

o 

- 

z 

5 
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local  M-smoothers  |  NDS  functional 

o  =  0 
DATA 


bilateral  filters 


o  =  l 

SMOOTHNESS 


m  -¥  0 

Bayesian  /  regularisation  /  diffusion  filtering 


F.2  Generalised  Nonlocal  Data  and  Smoothness  (GNDS)  Model 

From  the  NDS  model  one  can  see  that  the  tonal  weights  (U/.)  just  depend  on  differences  between  single 
pairs  of  pixels  (i.e.  the  similarity  measure  just  depends  on  single  pixels),  this  single  differences  have  limited 
ability  to  express  local  image  structure  and  geometry,  moreover,  for  practical  purposes  usually  this  differences 
applies  only  on  a  relatively  small  neighborhood. 

The  idea  for  this  generalized  model  is  to  add  the  a  more  powerful  measure  to  evaluate  similarity,  so  the 
authors  consider  the  concept  of  self-similarity  of  the  whole  image  based  on  models  like  NL-means  of  Buades 
et  al.  to  extend  the  previous  model  to  a  more  versatile  one.  Then,  the  idea  is  to  consider  a  measure  which 
compares  the  similarity  of  a  whole  patch  around  a  pixel  between  the  whole  patch  around  the  other  pixel,  this 
idea  is  usually  called  patch-similarity. 


F.2.1  GNDS  Functional  and  Its  Minimization 

Let  us  consider  the  following  functions  called  “tonal  distance”:  ds,dr  :  R2p  — ►  Rjj"  in  the  similarity  and 
the  regularization  term.  For  example,  in  the  similarity  term,  such  a  function  calculates  the  distance  between 
two  image  patches  u(Vi)  and  f{Vj)  of  the  initial  image  (in  the  case  of  the  regularization  term  we  change  this 
by  u(Vj)).  Where  the  index  set  Vi  denotes  the  image  patch  as  neighborhood  of  the  pixel  i.  Both  patches  are 
assumed  to  have  the  same  size  p  e  N  and  the  same  shape. 

The  standard  distance  function  for  this  purpose  is  the  weighted  L2  norm  used  in  NL  means  algorithm,  defined 
by: 

\d(u(Vi),  f(Pj))\  =  ga(p)(ui+p  —  fj+p) 

p 

where  ga(p)  :=  exp(—  p2 /(2a2)). 

With  this  definitions,  the  GNDS  model  is  defined  by: 

JGNDS(u)  =  (1  -  a)FGm(u)  +  aFGg(u) 

=  (!-«)  X!  ^^\ds{u('Pi)J('Pj))\2)-Ws{\xi-xj\2) 

i,jeJ 

+  a  X!  ^r{\dr{u{Vi),u{Vj))\2)  ■  wr(\xi  -  xj\2) 
i,jeJ 


The  minimization  strategy  is  the  same  as  for  the  NDS  model,  the  idea  is  to  find  a  fixed  point  form  for  this 
new  functional.  The  minimizer  of  ,JGNDS  necessarily  satisfies: 


dJGNDS 

Out 


(1  —  a) 


d  Fg 

^  sirr 

dui 


dFG 

OUi 


Vie  J 


where: 


dFG 

sim 

Out 

dFG 

^  reg 

dui 


da  *  V/ls(d2s.k_.J_.){0){uk  -  fj)  ■  ws(\Xi  -  xj\2) 

3&J 

4  9a*  ^(4;fc-j-)(0)K  -  Uj)  ■  Wr(\Xi  -  Xj\2) 

jeJ 
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Now  if  we  consider  the  following  notation: 


hff  :=  2 ga  *  <(KW^),  «(^))|2)(0) 

we  have  the  following  fixed  point  equation  for  the  GNDS  model: 


(1  -  «)  Eie  j  h?Jw 


S,W 

i,j  Jo 


fj  +aT,jejh?fw^^uo 


(!  -  «)  Eie  j  h?,jwij  +  a  Eie  j  Kffwl 


r,W  , 

r,w 

i,j 


(F.12) 


In  the  same  way  as  in  the  NDS  fixed  point  equation  we  can  deduce  a  min-max  principle  and  the  existence  of 
a  fixed  point,  also,  a  minimizer  can  be  found  via  gradient  descent  method. 

The  most  important  difference  between  this  model  and  the  NDS  model  is  the  use  of  patch  distances  instead 
of  single  difference  between  pairs  of  pixels,  this  induces  a  patch  similarity  measure,  with  is  more  powerful 
and  versatile  for  our  purposes. 


F.2.2  Double  Weighting:  Generalization  of  the  model  and  Particular  cases 

Let  us  focus  only  on  the  similarity  term  (the  same  applies  analogously  for  the  regularization  term)  and 
expand  the  terms  of  the  fixed  point  equation: 

'J  j,p€J 

where  Mjj  denotes  the  normalization  term. 

Notice  that  we  have  the  Gaussian  term  ga  two  times  in  the  formula,  once  during  the  patch  similarity  calcu¬ 
lation  (sum  over  g),  we  will  call  this  inner  weighting  of  patch  pixels,  and  then  it  appears  in  the  sum  over 
p ,  we  will  call  this  the  outer  weighting  which  is  applied  when  summing  the  results  of  the  function  'I''. 

Is  interesting  to  see  that  what  this  formula  means  is:  For  pixels  m  and  fj  we  first  calculate  the  patch  distances 
of  all  patches  at  positions  i+p  and  j  +p  taken  with  the  offset  p  around  tq  and  fj  respectively.  Then,  average 
this  patches  distances  (transformed  by  the  nonlinearlity  T')  using  the  outer  weight  ga.  Thus,  the  pixel  fj 
will  contribute  to  the  result  tq  with  a  high  weight  only  if  the  patches  around  rq  and  fj  are  similar  AND  the 
patches  around  rq+p  and  fj+p  are  similar. 

A  natural  question  arises  when  we  look  at  the  expression  (F.13),  we  have  the  two  weights  (inner  and  outer) 
with  the  same  parameter  (a)  but  with  different  role  in  the  expression,  the  parameter  is  the  same  because 
this  becomes  from  the  derivation  of  the  energy  functional,  but,  since  the  role  between  them  is  different,  is 
natural  to  ask  what  happens  if  one  change  the  parameter  of  the  weights.  If  we  consider  that  the  weights  have 
different  parameters  we  have: 


i+p+q  fj+P+qf  (F-13) 

W  / 


ui  ^  '  9p{p)  '  ^  j  ^  '  9<y{o)\ui+p+q  fj+p+q\  j  wi,jfj  (F-14) 

l,J  j,p&J  \q£J  J 

This  simple  observation  makes  a  more  generalized  version  of  the  model  (obviously  this  is  not  derived  from 
the  original  functional,  but  is  the  generalized  version  of  this  one  since  if  we  take  p  =  a  we  are  in  the  original 
model  again),  and  establishes  a  family  of  filters  from  the  GNDS  model.  This  is  the  most  general  version 
of  the  model,  and  the  following  table  shows  the  particular  cases  that  reduces  the  model  to  a  known  other 
models,  (by  adjusting  the  parameters  AND  the  involved  functions). 
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Table  F.l:  Filtering  Methods  Belonging  to  GNDS  Family 
Regularization  parameter  Patch  size  Integration  Scale  Method 

[0.5ex]  0  <  a  <  1  cr  >  0  p  =  a  Original  GNDS 

a  €  {0, 1}  a  >  0  p  — >  0  NL-means  (original) 

a  €  {0, 1}  a-  — >  0  p  >  0  NL-means  (anisotropic) 

0<a<l  p  — >  0  a  0  Original  NDS 

[lex] 

A  remark  on  the  complexity  of  this  scheme:  Given  an  image  of  N  pixels,  a  square  search  window  w  with 
s2  pixels  and  a  circular  patch  of  radius  r,  the  computational  complexity  of  the  generalized  filter  family  (i.e. 
the  fixed  point  scheme  in  the  most  general  version,  with  two  different  weights:  gp,ga  is:  0(N  x  s2  x  r2  x  r2) 
And  finally,  a  remark  in  the  case  of  multichannel  images:  The  extension  of  GNDS  model  to  multichannel 
images  is  natural,  we  just  need  to  consider  a  patch  distance  based  in  a  vector  way,  computing  the  norm 
between  vectors  u  and  f  (these  are  vectors  with  d  components,  where  each  component  correspond  to  a 
different  channel),  i.e.: 

\d(u(Vi),  f(Vj))\2  =  ^2ga(jp)\\ui+p  -  fj+p\\l 

p 

using  the  euclidean  norm.  The  optimality  conditions  are  the  same  as  the  original  model,  just  considering  it 
on  each  channel. 
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F.3  Numerical  Examples 


®  PS  NR  =  30.64.  GNDS  (k)  PS  NR  =  29.80,  GNDS  0)  PSNR  =  32.05,  GNDS 


Figure  F.l:  Comparison  to  state-of-the-art  methods.  Top  Row:  Test  images  degraded  by  Gaussian  noise 
with  standard  deviation  20.  2nd  Row:  Restored  images  by  Dabov  et  al.  (2007).  3rd  Row:  Restored  images 
by  Kervrann  and  Boulanger  (2008).  Bottom  Row:  Restored  images  by  the  proposed  GNDS  model. 
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G.  Nonlocal  Mumford-Shah  Regularizes  for  Color 
Image  Restoration  [14] 


The  main  goal  of  this  paper  is  to  develop  several  functionals  based  on  approximations  of  the  original 
Mumford-Shah  functional  with  nonlocal  characteristics  incorporated,  this  nonlocal  characteristics  are  based 
on  the  work  of  Buades  et  al.  and  Gilboa  et  al. 

Is  important  to  incorporate  the  nonlocal  characteristics  in  new  models  because  this  perform  better  than 
local  methods  in  image  denoising  and  restoration  when  the  image  have  textures  (for  example,  local  methods 
usually  consider  textures  as  noise,  and  then,  in  denoising  tasks  this  algorithms  just  remove  the  textures). 
The  authors  presents  non  local  extensions  for  the  widely  known  approximations  for  the  Mumford-Shah 
functional,  this  approximations  are  due  to  Ambrosio-Tortorelli  and  Shah  with  the  primary  objective  of  better 
restoration  of  fine  structures  and  textures.  Some  applications  (and  algorithms)  are  presented  for  the  following 
image  tasks: 

1.  Color  Image  Deblurring  and  Denoising 

2.  Color  Image  Inpainting 

3.  Color  Image  Super-Resolution 

4.  Color  Filter  Array  Demosaicing 

G.l  Introduction  -  Background 

First  of  all,  we  need  to  recall  some  basic  results  and  concepts  about  image  regularization  methods. 

G.1.1  Local  Regularizers 

The  basic  Mumford-Shah  regularizating  functional  is  used  commonly  in  segmentation  and  restoration 
algorithms,  it  is  given  by  the  following  formulation: 

Given  u  :  — >  R.  and  K  its  edge  set,  the  MSH 1  regularizer  is: 

JMSH\u,K)  =  /3  [  \\7u\2dx  +  a  f  dU 1 

Jq\k  Jk 

where  |Vu|  =  \/u21  +  it|2,  x  =  (#1,2:2),  di1  is  the  1-D  Hausdorff  measure  and  H  C  K2  is  the  open  image 
domain.  Notice  that  the  first  term  enforces  u  to  be  smooth  everywhere  except  on  the  edge  set  K,  and  the 
second  term  enforce  to  minimize  the  total  length  of  edges. 

In  general  is  very  hard  to  minimize  (in  practice)  this  functional,  due  to  its  non  convex  behaviour.  A  way  to 
solve  this  problem  is  to  consider  another  functionals  (with  better  structure)  which  approximate  this  one  in 
some  sense  that  asserts  that  the  minimum  points  of  this  new  functionals  approximates  the  minimum  points 
of  the  original  one. 

Ambrosio  and  Tortorelli  approximated  the  Mumford-Shah  functional  by  considering  a  sequence  of  more 
regular  functionals  denoted  by  Je  which  converges  to  JMSH  in  the  sense  of  T-convergence.  The  idea  of  this 
functional  is  to  approximate  the  edge  set  K  by  a  smooth  function  v,  the  approximation  is  given  by: 

J^SH  (u,v)  =  (3  J  v2\S7u\2dx  +  a  J  ^e|V#|2  +  ^  ^  ^  ^  dx 

where  0  <  v(x)  <  1  represents  the  edges:  v(x)  ~  0  if  x  £  K  =  Ju  (jump  set  of  u),  v(x)  ~  1  otherwise;  e  is 
a  small  positive  constant,  a,  /3  positive  weights.  If  we  add  a  fidelity  term  to  this  functional  we  have  that  a 
minimizer  u  =  ue  of  J^SH  approaches  a  minimizer  of  JMSH  as  e  — ►  0 

Another  approach  is  given  by  Shah  using  the  total  variation  regularization  proposed  in  image  restoriation 
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mainly  by  Rudin,  Osher  and  Fatemi,  this  is  very  useful  due  to  its  benefits  of  preserving  edges  and  convexity. 
The  total  variation  regularization  is  defined  in  the  following  way:  given  a  locally  integrable  function  u  define: 

JTV(u)  =  sup{  [  uV  •  <j>dx  :  <t>  £  ||</>|U~(n)<1} 

Jn 

which  coincides  with  fn  |Vit|cte  when  u  £  Wl,1(£l). 

Based  on  this,  Shah  proposed  a  modified  version  of  Ambrosio-Tortorelli  approximation  by  replacing  the  term 
|Vit|2  by  |Vw|  in  the  first  term,  then,  the  Shah  approximation  for  the  Mumford-Shah  functional  is  given  by: 

J^ISTV (u,v)  =  f3  J  v2\Vu\dx  +  a  j  ^g|Vz;|2  +  —  ^  ^  ^  dx 

This  functional  T-converges  to  the  JMSTV  functional  given  by: 

JMSTV  =  p  [  \Vu\dx  +  a  I  -  ^7“  Li dH1  +  \Dcu\(n) 

Jn\K  Jk  1  +  \u+  -  u  | 

Where  u+,u~  denotes  the  values  of  u  ‘at  each  side’  of  K  and  Dcu  is  the  Cantor  part  of  Du.  This  last 
functional  is  very  similar  with  the  total  variation  of  u  £  BV(fl)  that  can  be  written  for  K  =  Ju  as: 

JTV  =  (3  f  \S7u\dx  +  a  f  \u+ ~  u~ \dl~L1  +  \D cu\(£l) 

Jn\K  Jk 

The  only  difference  is  that  the  MSTV  regularizer  does  not  penalize  the  jump  part  as  much  as  the  TV  regu- 
larizer  does. 

This  functionals  are  considered  only  for  monochromatic  images,  but  is  naturally  extended  to  color  images  by 
Blomgren  and  Chan,  which  propose  a  color  TV  regularization  by  coupling  the  channels,  i.e.  considering: 

JTV  =  f  ||Vu||dx  =  /  J\S7uR\2  +  |VuG|2  +  \\/uB\2dx 
Jn  Jn  v 

Bar  et  al.  extend  this  idea  for  the  Mumford-Shah  approximations  for  color  images,  by  replacing  |Vu|  by 
||Vit||  in  J^SH  and  J^ISTl  .  Notice  that  the  scalar- valued  edge  map  v  is  common  for  the  three  channels  and 
provides  the  necessary  coupling  between  colors. 


G.1.2  Nonlocal  Methods 

As  we  seen  before  in  review  of  the  paper  of  Buades  et  al.  the  importance  of  nonlocal  methods  is  based  on 
their  well  adaptation  to  texture  denoising  in  contrast  to  standard  local  methods.  Recall  that  the  basic  idea  is 
to  extend  the  concept  of  neighborhood  filters  which  replace  the  value  of  a  pixel  with  an  average  of  its  spatial 
neighbors,  the  nonlocal  filters  extend  this  concept  to  the  one  of  patch-similarity,  i.e.  we  will  replace  the  value 
of  a  pixel  for  an  averaging  of  pixels  which  have  similar  patch  values,  (and  then,  the  spatial  restriction  is 
relaxed).  The  classical  filter  for  this  task  is  the  NL-means  filter  due  to  Buades  et  ah: 

att  t  ti  w  1  /'  (  da(f(x),f(y))\ 

NL{fix))  =  cw  j™  { - is - J  !{v)dy 


da(f(x),f(y))=  f  9a(t)\\f(x  +  t)  —  f{y  +  t)\\2dt 
J  R2 

where  da  is  the  patch  distance,  /  is  the  image  to  be  filtered  and  ga  is  a  Gaussian  kernel  with  standard 
deviation  a  which  determines  the  patch  size.  For  more  information  of  this  method  you  can  see  the  chapter  3 
of  this  review. 


G.1.3  Nonlocal  Regularizes 

The  idea  of  this  regularizers  is  to  see  the  nonlocal  filtering  as  a  quadratic  regularization  based  upon  a 
nonlocal  graph  (a  graph  with  weights).  The  most  important  contributions  on  this  field  are  given  by  Gilboa 
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and  Osher. 

We  will  need  some  operators  from  this  theory,  the  so  called  “non  local  differential  operators  over  graphs” 
which  are  proposed  by  Gilboa  and  Osher. 

Let  u  :  — ►  R  and  w  :  Cl  x  — ►  R  a  non  negative  and  symmetric  weight  function.  We  define  the  non  local 

gradient  vector  Vwu  :  Q  x  Q  — >  R.  as: 

( Vwu)(x,y )  =  (u(y)  -  u(x))y/w(x,y) 

And  the  norm  of  the  nonlocal  gradient  of  u  is  defined  by: 


\Wwu\(x)  :=  y  J  ( u(y )  -  u(x))2w(x,y)dy 


We  also  define  the  non  local  divergence  of  the  vector  v  :  x  Q  — >  R  by: 

(divwv)(x)  :=  /  (v(x,  y)  -  v(y,  x))y/w{x,  y)dy 
Jn 

Inspired  in  this  operators,  Gilboa  and  Osher  propose  the  following  general  form  for  nonlocal  regularizating 
functionals: 


J(u)  =  /  <f>{\Vwu\2)dx 
Jn 


where  s  K >•  <f>(s)  is  positive,  increasing  and  convex  in  y/s,  and  cf>( 0)  =  0. 

If  <j)(s)  =  yfs,  they  propose  the  NL/TV  (NonLocal  Total  Variation)  regularizer: 


jNL/TV(uj_  j  \ywU\dx=  f  J  f  (u(y)  -  u(x))2w(x,y)dydx 
Jo,  Jo,  V  Jfi 

Which  coincides,  in  the  2D  local  case  to  JTV (u)  =  fn  \Wu\dx 


G.2  Proposed  Nonlocal  Mumford-Shah  Regularizers 

Based  on  the  above,  the  authors  propose  nonlocal  versions  of  the  approximating  functionals  of  Ambrosio- 
Tortorelli  and  Shah  for  the  Mumford-Shah  functional,  and  considering  an  appropiate  fidelity  term  they 
develop  functionals  for  the  following  image  tasks: 

1.  Deblurring-Denoising 

2.  Inpainting 

3.  Super-Resolution 

4.  Demosaicing 

Is  important  to  recall  that  they  also  incorporate  the  vector  case  (i.e.  color  images)  in  their  formulation,  in 
the  way  as  we  seen  above. 

Then,  the  general  model  proposed  by  the  authors  is: 

JNL/MS(u,v)  :=  p  J^v2<f>(\\V  wu\\2)dx  + a  ^|Vu|2  +  dx  =  F^/MS(u,  v)  +  FAT{v) 

' - - V - '  ' - - - - - ' 

fZl/ms  Fat 

where  u  :  O  — >  R3,  v  :  O  — >  [0, 1]  and  (f>(s)  =  s  or  <j>(s)  =  yfs  (the  first  choice  correspond  to  N L / AI S H1  and 
the  second  to  MS/TV)  i.e.: 

JNL/msh\u,v)  :=  p  J^v2\\Vwu\\2dx+a  ^|Vu|2  +  {v  dx  =  F%jg‘ S(u,v)  +  FAT(v) 

V — - v - ' 

j-iNL/MS 

*regAT 
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jNL/mstv (U)W)  :=/3  jf  v2||Vll)u||d®+a  jT  (e|Vu|2  +  (v  4e1}~^)  dx  =  F^MS(u,v)  +  Fat(v) 


t-.NL/MS 

*regS 

and  recall  that: 


||Vwit||(a:) 


ui{y))2w{x,y)dy 


as  we  said  before,  adding  a  fidelity  term  to  this  functionals  we  will  be  able  to  perform  a  specific  restoration 
task,  we  will  discuss  this  in  the  next  section. 

As  an  additional  remark,  in  the  practice  the  weight  function  that  will  be  used  is  the  classic  NL-means  weight 
(given  an  image  q ): 


w(  x, y )  =  exp 


da{q{x),q{y))\ 
h 2  ) 


and  we  use  search  windows  S(x)  =  {y  /  \x  —  y\  <  r} 


G.3  Image  Restoration  Tasks  with  NL/MS  Regularizes 

As  we  said  before,  choosing  an  appropiate  fidelity  term  we  will  construct  functionals  for  several  tasks  of 
image  restoration,  also  sometimes  we  will  need  to  do  a  pre-process  to  image  in  order  to  construct  the  weight 
function  appropriately. 


G.3.1  Deblurring-Denoising 

The  standard  degradation  model  for  deblurring  and  denoising  is: 

/  =  k  *  u  +  n  (/*  =  k  *  ul  +  n\i  =  R,G,  B ) 


where  k  is  the  associated  (known)  blurring  kernel  and  n  is  an  additive  noise  (which  can  be  Gaussian,  Laplacian 
or  impulse  noise  (in  the  last  case  a  preprocessing  is  needed,  because  the  model  is  no  longer  valid  in  this  case)). 
Then,  in  the  case  of  Gaussian  noise,  the  L 2  fidelity  term  is  commonly  used  (this  is  led  by  maximum  likelihood 
estimation) 


<&(/  —  k*  u) 


El  r~k*' 


2dx 


For  the  impulse  noise  (or  Laplacian)  the  L1  fidelity  term  is  more  appropriate  (due  to  its  better  results  on 
removing  outlier  effects),  then,  considering  the  case  of  independent  channels  noise  we  have: 


$(/  —  k*u) 


[  T  i  r  —  fc  *  ui\dx 
In  ■ 


Then,  the  proposed  types  of  total  energy  for  color  image  deblurring-denoising  are: 

JGau(u,  V)  =  \f  El  f-k*  y2dx  +JNL'MS{u,  v)  =  F°™(u)  +  JNL/MS(u,  v) 
IJn  . 

' - v - ' 

JImp(u,  v)  =  f  E  If  -  k  *  v/\dx  +JNL'MS(u,  v)  =  F'2P(u)  +  JNL,MS{u ,  v) 
Jn  ■ 


The  proprocess  needed  for  the  construction  of  a  image  g  needed  for  the  construction  of  weight  function  w  is 
given  by: 

1.  Initialize:  rl0  =  0,  gl0  =  0,  i  =  R ,  G,  B 
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2.  do  (iterate  m  =  0, 1, 2, . . .) 

•  gm+ 1  =  median(f  +  rm,  [s  s]) 

•  rm+ 1  =rm  +  f  -k*  gm+1 

while  II. f  ~k*gin\\i>  Ei  Wf  ~k*  9lm+ illi 

G.3.2  Inpainting 

In  this  problem  we  have  an  observed  image  f  given  by: 

f  =  u  in  Q\  D 

Where  D  =  D°  is  the  region  where  the  input  data  u  is  lost.  Inspired  in  the  work  of  Chan  and  Shen  [4],  the 
authors  propose  the  following  energy  functional  for  inpainting: 

JInp(u,  v)  =  \jn  ln\n(*)  E  I  f  ~  ul\2dx  +JNL/MS(u,  v)  =  FlZ{u)  +  JNL'MS{u,  v) 

s - J~ - ' 

plnp 

Also,  the  weights  w  are  updated  only  in  the  damaged  region  D  in  the  m- th  iteration  for  u  using  the  modified 
patch  distance: 

da(u(x),u(y))  =  [  ln\R(x  +  t)ga(t)\\u(x  +  t) -u(y  +  t)\\2dt 

J  R2 

where  R  C  D  is  a  unrecovered  region  (still  missing  region). 

Then,  the  missing  region  D  =  D°  can  be  recovered  by  the  following  iterative  algorithm,  which  produces 
regiones  D%,  i  =  0, 1,  2, . . .  con  Dn  C  . . .  C  D° 

Algorithm 

1.  Compute  weights  w  for  x  €  S2  such  that  P(x)  (~l  (fl  \  D°)  ^  0  using  d^°  (u° {x) ,  u° (y))  with  u°  =  f  in 
f 1  \  D°  and  oo  in  D° ,  a  patch  P(x)  centered  at  x,  and  y  £  S(x)  D  (fl  \  D°). 

2.  Iterate  n  =  1, 2, ...  to  obtain  a  minimizer  (it,  v )  starting  with  u  =  u°: 

(a)  For  fixed  u,  update  v  in  f l  to  obtain  vn 

(b)  For  fixed  v,  update  u  in  Q  to  obtain  un  with  a  recovered  region  V.  \  Dn:  at  every  m-th  iteration, 
update  weights  w  only  in  x  £  D°  subject  to  P{x)  (~l  (fl  \  Dm’n)  yf  0  with 

dam’n  Mx),u(y)) 

where  y  £  S(x)  (~l  (fl  \  £)ra,m),  Dn’m  is  and  un-recovered  region  in  D°  until  m-th  iteration  with 

j^n  _  j~yn,n  (E  •  •  •  (E  d  /)n,m 

G.3.3  Super-Resolution 

The  idea  of  this  process  is  to  recover  a  high  resolution  image  from  a  filtered  and  down-sampled  image 
(This  process  is  usual  in  a  sequence  of  images  in  video).  Our  observed  data  is 

/*  =  Dk(h*  «*)  i  =  R,G,B 

where  h  is  a  low-pass  filter,  D k  '■  RnXTl  — >  Mpxp,  where  p  =  [ n/k 2]  is  the  down-sampling  operator  with  a 
factor  k  on  each  axis.  We  want  to  recover  u  £  (Rnxn)3,  this  could  be  done  minimizing: 

JSup(u,v)  =  i/E  If  -  Ac  (h  *  ui)\2dx+JNL/MS[u,v)  =  F^(u)  +  JNL'MS{u,v) 

2  A  i 

v, _  ^ 

pSup 

In  addition,  for  the  computation  of  the  weights  w  we  need  a  super-resolved  image  g  £  (Knxrl)3  obtained  by 
bicubic  interpolation  of  /.  Notice  that  g  is  only  used  for  the  computation. 
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G.3.4  Demosaicing 

The  idea  is  to  reconstruct  a  full  color  image  from  the  incomplete  color  samples  output  from  an  image 
sensor  overlaid  a  color  filter  array  (CFA).  A  CFA  is  a  mosaic  of  color  filters  in  front  of  the  image  sensor  (we 
use  here  the  Bayer  filter  [2]).  Since  each  pixel  of  the  sensor  is  behind  a  color  filter,  the  output  is  and  array 
of  pixel  values,  each  indicating  a  raw  intensity  of  one  of  the  three  filter  colors.  In  this  variational  framework, 
we  consider  the  observed  data  /  as: 

f  =  H'  -ul  i  =  R ,  G,  B 

where  •  is  the  pointwise  product,  and  H 1  is  the  down-sampling  operator;  H°  has  alternating  1  and  0  values 
for  odd  rows  and  alternating  0  and  1  values  for  even  rows,  HR  has  alternating  0  and  1  values  for  odd  rows 
and  1  and  only  0  values  for  even  rows,  HB  has  only  0  values  for  odd  rows  and  alternating  1  and  0  values  for 
even  rows.  The  authors  propose  the  following  minimization  problem  to  recover  a  full  color  image  u: 

JDemo{u ,  v)  =  l  [  V  |  f  -  Hl  ■  ul\2dx  +JNL/MS{u,  v )  =  F^mo(«)  +  JNL/MS{u ,  v) 

s - v - / 

F?i%no 

Moreover,  for  the  computation  of  the  weight  function  w,  the  authors  use  the  interpolated  image  obtained  by 
applying  Hamilton- Adams  algorithm  [8]  for  the  green  channel  and  bilineal  interpolation  for  R-G  and  B-G. 


G.3.5  Numerical  Discretizations 


For  minimize  the  proposed  functionals  in  u  and  v  we  will  consider  the  Euler  Lagrange  equations,  notice 
that: 


QjGau,Im,Inp,Sup,Demo 


dv 


=  2(3v(j)(\ |  S7  wu\  |2)  —  2eaAv  +  a 


v  —  1 
2e 


=  0 


dJGa 

du 

dJIm 


=  k*(k*u  —  f)  +  Lnl/msu  =  0 


=  k  *  sign(k  *u  —  f)  +  Lnl^msu  =  0 
du  =ln\D(u-f)  +  LNL/MSu  =  0 


du 

Qjinp 


8JSup 

du 


=  h*(Dl ( Dk(h  *u)-f))  +  Lnl/msu  =  0 


dJDe 


du 


=  H  ■  (H  ■  u  -  f)  +  Lnl/msu  =  0 


where  k(x)  =  k(—x),  h(x)  =  h{—x),  is  the  transpose  of  Dk  (i.e.  the  up-sampling  operator)  and: 

lnl/msu  =  _2  f  {(u(y)  -  u(x))w(x,y)  ■  [(v2(y)(l)'(\\X7wu\\2(y)) +  v2(x)(l)'{\\Vwu\\2{x))]}dy 
Jn 

To  solve  two  Euler-Lagrange  equations  simultaneously  the  alternate  minimization  (AM)  approach  (which  we 
seen  in  the  second  chapter)  is  applied.  Since  the  energy  functionals  are  not  convex  in  (u,  v )  we  may  compute 
only  a  local  minimizer.  The  proposed  iterative  algorithm  is: 


Initialization:  u°  =  /,  v°  =  1 


•  Iterate  n  =  0, 1,2, . . .  until  (||rtn+1  —  un||2  <  i'||u”||2) 


1.  Solve  the  equation  for  vn+1  using  Gauss-Seidel  Scheme: 

(2/3</>(||VU)rtn||2)  +a/(2e)  -  2eaA)vn+1  =  a/(2e) 

2.  Set  un+1,°  =  un  and  solve  for  un+1  by  iterating  on  l : 

un+l,l  +  l  =  un+l ,1  _  dt  .  ^(un+l,J  n+1) 
ou 
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where  v  is  a  small  positive  constant. 

The  discretization  of  the  functions  involved  is:  Let  uk  the  value  of  a  pixel  k  in  the  image  (1  <  k  <  N) 
with  channel  i  =  R,G,B  i.e.  the  discretization  of  ul{x),  let  pk  l  be  the  discretized  version  of  pl(x,y)  with 
x,y  £  fl).  Let  Wk,i  the  sparsely  discrete  version  of  the  weight  function  w,  this  is  constructed  by  the  same 
algorithm  described  by  Gilboa  and  Osher  in  chapter  4.  We  follow  the  same  notation  for  neighborhood  sets 
as  in  Gilboa  and  Osher  paper  (and  in  Buades  paper):  l  £  A fk{l  :  Wk}i  >  0}.  Then,  we  define  Vwd  and  divwd, 
the  discretizations  of  Vw  and  divw  as: 


^ wdi^k)  • —  (^1  l  ^ 

divwd(plki)  :=  ( Pk,i  -Pli,k)V™kj 

ieMk 


Then  we  have: 


\P%  =  J  Y^(.P\  i )2  the  magnitude  of  pk  ;at  k 


and  then,  the  discretization  of  ||Vmu||2: 


ivw^ii2  =  E  ivwrf4i2  =  EEw-4)2^ 

ie{R,G,B}  i  l 


For  details  on  the  construction  of  w  we  refer  to  the  algorithm  given  in  Gilboa  and  Osher  paper,  on  chapter 
4. 


G.4  Numerical  Examples 


Figure  G.l:  DENOISING  OR  DEBLURRING  IN  THE  PRESENCE  OF  GAUSSIAN  NOISE.  Top:  (first) 
noisy  image  (/  =  u  +  n)  with  noise  variance  (a  =  0.02),  recovered  images  using  (second)  MSTV  and  (third 
and  fourth)  NL/MSTV  with  edge  set  v.  Bottom:  (first)  original,  (second)  blurry-noisy  data  (/  =  k  *  u  +  n) 
and  noise  variance  (er  =  0.02)  recovered  images  using  (third)  MSTV  and  (fourth)  NL/MSTV. 
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Figure  G.2:  INPAINTING  of  150  x  150)  size  image.  First  column:  (top  row)  original,  (middle)  data  /  . 
Second  and  third  columns:  recovered  images  using  (top)  MSH 1,  MSTV,  (middle)  NL/MSH 1,  NL/MSTV, 
with  51  x  51  search  window  and  9x9  patch.  Bottom  row:  process  of  inpainting  with  NL/MSH 1  in  100th, 
200th,  and  350th  iterations. 


Figure  G.3:  SUPER-RESOLUTION  of  a  still  image.  Top:  original  image  of  size  272  x  272,  blurred  image 
h  *  u  with  out  of  focus  blur  kernel  h  with  radius  r  =  3,  down-sampled  data  /  =  Dk(h  *  u)  of  size  68  x  68 
with  k  =  4,  preprocessed  (up-sampled)  image  q  using  bicubic  interpolation.  Bottom:  recovered  images  using 
(left)  MSTV,  (right)  NL/MSTV. 
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Figure  G.4:  DEMOSAICING  using  NL/MSTV  with  iterative  algorithm.  Top:  original  image,  data  /  , 
interpolated  image  using  Hamilton- Adams  for  green  and  bilinear  interpolation  for  R  —  G  and  B  —  G.  Bottom: 
demosaiced  images  with  decreasing  sequence  of  h  =  {16,8,4} 

G.5  Summary  and  Conclusions 

After  several  testings  to  check  the  performance  of  the  methods  proposed  by  the  authors  and  other  common 
methods  (results  can  be  seen  on  the  Numerical  examples,  and  more  information  can  be  seen  on  the  original 
paper)  the  authors  conclude: 

1.  NL/MSH 1  and  NL/MSTV  outperform  the  local  one  methods  on  all  the  applications. 

2.  For  deblurring-denosing  NL/TV  and  NL/MSTV  provide  the  best  recovered  images. 

3.  NL/MSTV  brings  the  best  results  in  super-resolution. 

4.  NL/MSH 1  and  NL/MSTV  provide  superior  results  to  local  models  by  better  recovering  textures  and 
large  missing  regions. 

5.  NL/MSTV  reconstructs  images  best  in  demosaicing. 

To  sum  up,  NL/MSTV  produces  the  best  results  in  general.  The  next  step  of  the  project  is  to  consider  the 
numerical  implementation  of  these  models. 
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H.  Index  Measurements 


H.l  Camouflage  measure  models 

Camouflage  is  used  to  prevent  detection  of  the  individual  combat  soldier  and  his  equipment  by  both 
visual  observation  and  other  sensors,  and  thereby  to  increase  the  survivability  of  all  soldiers  and  the  mission 
effectiveness.  Camouflage  achieves  its  purposes  by  blending  the  soldier  with  the  background,  and  disrupting 
the  cues  for  perceiving  the  soldier  as  a  single  object  (Killian  &  Hepfinger,  1992). 

Since  the  mid  1990’s,  there  has  been  a  need  to  improve  the  understanding  of  how  low  contrast  targets, 
such  as  camouflage,  are  perceived  by  observers  and  acquired  in  complex  surrounds.  Although  there  has 
been  a  long  history  of  attempting  to  predict  the  probability  of  psychophysical  task  (Detection,  Recognition, 
Identification)  for  target  acquisition  by  means  of  electro-optical  systems.  Most  of  the  models  developed  for 
predicting  psychophysical  task  performance  of  a  given  target  have  been  expressed  in  terms  of  the  model’s 
target  signature  metric  and  an  ensemble  average  of  observers.  These  models  are  not  highly  accurate  in 
predicting  the  psychophysical  performance  of  a  unique  target  in  a  unique  surround  due  to  the  absence  of 
a  sufficiently  adequate  description  of  the  target  signature  and  natural  surround  and  the  interactions  of  the 
target  signature  with  the  local  surround  (Desmond,  2004).  Figure  3  shows  target  acquisition  models  that 
were  developed  for  military  purposes,  with  the  primary  emphasis  on  sensor  performance  (Desmond,  2004) 
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Figure  H.l:  Military  and  Academia  Progress  Toward  Modeling  Observer  Target  Acquisition  Performance 
(Desmond,  2004) 


H.1.1  S-CIELAB  Metric 

The  original  digital  image  is  decomposed  into  a  color  space  based  on  opponent-colors,  and  each  color 
separation  is  convolved  with  a  Gaussian  shaped,  two-dimensional  separable  spatial  filter  to  simulate  the 
blurring  of  the  human  visual  system.  Zhang  and  Wandell  pointed  out  those  psychophysical  experiments 
suggest  the  human  visual  system’s  representation  of  simple  colored  patterns  is  pattern-color  separable.  Thus, 
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the  image  can  be  pre-processed  with  separate  color  and  spatial  filters  which  correspond  to  the  human  visual 
system  prior  to  transformation  to  CIELAB  color  space.  The  delta  E  color  difference  is  calculated  between 
the  original  and  its  copy  to  quantitatively  describe  any  reproduction  error.  Spatially  correlated  color  error 
maps  or  histograms  of  color  error  can  be  generated  using  the  1978  CIELAB  color  space. 
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Figure  H.2:  The  S-CIELAB  Color  Methodology 


H.1.2  The  Sarnoff  Visual  Discrimination  Model 

The  Sarnoff  model,  requires  two  images  as  input  for  analysis  and  produces  an  output  in  terms  of  Just 
Noticeable  Difference  (JND). 

An  image  pair  with  one  JND  has  a  75%  probability  of  discrimination,  two  JNDs  have  has  a  93%  probability 
of  discrimination,  and  three  JNDs  have  a  98%  probability  of  discrimination,  this  model  quantifies  an  observer 
psychophysical  discrimination  task,  i.e.,  measuring  the  difference  of  comparing  two  similar  images,  rather  than 
measuring  an  observer  military  detection  task 

H.2  Patterns  evaluation  system 

H.2.1  MACE  (mobile  army  camouflage  evaluation  Killian  &  Hepfinger  (1992)) 

The  MACE  system  consists  of  hardware  and  software  for  acquiring,  reducing  and  analyzing  images 
of  camouflage-in-background  scenes.  Analysis  with  MACE  system  includes  transformation  from  a  set  of 
monochromatic  images  to  standard  color  description  coordinates,  and  direct  comparisons  of  first  and  second 
order  statistical  measures  between  the  camouflaged  soldier  and  the  local  background. 

H.2. 2  Data  reduction 

Is  a  transformation  from  a  set  of  monochromatic  images  to  standard  color  description  coordinates.  We 
note  that  MACE  system  uses  the  standard  system  CIE  (Commision  Internationale  de  L’Eclairage)  for  color 
matching  built  around  Grassman’s  tristimulus  laws  for  color  mixing: 

/  A"  \  noo  /  ipx(  A)  \ 

y  }=k  R(X)S(X)  <py(X)  dX, 

{  z  J  Jo  {  <pz(X)  ) 

where: 

•  R( X)  is  the  surface  reflectance  of  an  object  under  observation. 

•  S( X)  is  the  spectral  radiance  of  the  illuminant. 
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Sarnoff  JND  Model 


Figure  H.3:  The  Sarnoff  JND  Model 


•  ?(A)  are  the  tristimulus  response  functions  for  standar  observer. 

•  k  is  a  normalization  constant. 

The  data  reduction  process  consists  of  three  steps:  frame  registration,  reflectance  extraction,  and  finally  the 
transformation,  using  the  above  equation. 

H.3  Measures  of  similarity 

Killian  &  Hepfinger  (  1992)  implemented  a  set  of  object /background  similarity  measures  that  are  sensitive 
to  transitions  in  intensity,  color,  and  texture  as  a  basis  for  camouflage  evaluation.  The  measured  the  difference 
in  mean,  standard  deviation,  minimum  and  maximum  value,  coefficient  of  skewness,  coefficient  of  kurtosis, 
and  Bhattacharyya  distance  between  the  first  order  distribution  of  objet  and  background.  They  also  used 
measures  based  on  spatial,  gray  level  co-ocurrence  matrix  (SGLCM)  and  the  spatial  gray  level  difference 
distribution  of  Weszka.  All  of  these  features  are  determined  for  each  color  component  image  over  a  range  in 
scale  space.  The  texture  features  are  also  calculated  for  four  orientations  of  the  displacement  operator.  The 
final  step  of  MACE  system  is  to  rank  order  a  series  of  camouflage  patterns,  based  on  its  features. 

H.3.1  Georgia  Tech  Vision  GTV 

The  Georgia  Tech  Research  Institute  developed  a  simulation  of  human  vision  and  visual  cognition  that  is 
capable  of  automatically  detecting  and  identifying  targets  and  other  types  of  visual  features  in  visible,  IR, 
and  SAR  imagery.  The  simulation,  called  GTV  (Georgia  Tech  Vision),  was  originally  developed  for  the  Army 
AMCOM  for  the  purpose  of  evaluating  camouflage  and  IR  suppression.  It  has  since  proven  to  be  a  powerful 
tool  when  used  in  a  number  of  ATR  scenarios,  including  identification  of  defects  in  agricultural  products, 
finding  areas  of  interest  in  multi-spectral  overhead  imagery,  and  identification  and  classification  of  lesions  in 
biomedical  imagery. 
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H.3.2  Synthetic  Image  Analyst  (SynIA) 

SynIA  is  a  GTV-based  software  that  incorporates  a  model  of  vision  and  visual  cognition.  Like  a  human 
image  analyst,  SynIA  must  experience  each  target  in  a  number  of  different  views  and  contexts  before  it  can 
reliably  discriminate  the  targets  from  other  objects  and  background  clutter. 

The  normal  sequence  of  operations  in  interpreting  images  is: 

•  Analyze  the  both  the  training  and  test  images  into  their  visual  features. 


Figure  H.4:  Examples  of  spatial  frequency  channel  outputs  for  the  Obare  vehicleO  input  image  shown  in 
Figure  5 

•  Train  SynIA  by  using  the  outputs  of  the  SynIA  Analysis  for  the  training  set.  This  generates  target 
knowledge  files.  In  training,  the  targets  are  specified  by  mask  images,  in  which  the  targets  pixels  are 
set  to  non-zero  values  and  other  pixels  are  set  to  zero. 

•  Interpret  the  test  images  by  using  the  outputs  of  SynIA  analysis  and  specifying  the  target  knowledge 
files  generated  by  training  SynIA  with  similar  images. 

In  order  to  quantify  the  degree  of  similarity  between  signature  of  the  various  vehicle  configurations,  it  used 
a  figure  of  merit  called  the  Spatial  Pattern  Angle.  This  measure  is  an  analog  of  the  spectral  angle.  The 
spectral  angle,  commonly  used  in  hyper- spectral  signal  analysis,  is  the  resultant  angle  between  two  points  in 
multi-dimensional  space  made  up  of  the  various  spectral  dimensions.  Analogously,  the  measure  used  in  this 
study  is  the  resultant  angle  between  points.  The  spatial  pattern  angle  is  defined  in  a  hyper-space  defined  by 
the  24  spatial-filter  channel  outputs,  as: 


where  a  and  b  are  vectors  representing  the  24  filter  channel  outputs  for  two  different  target  configurations. 
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Figure  H.5:  Input  images  showing  target  configurations  and  masks. 
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